System Architecture
Architecture Overview
The DOCiD™ (Digital Object Container Identifier) system is a platform for managing publications and document identifiers. It is built around persistent identifier (PID) services, metadata management, and scholarly communication tools, and is designed to support cultural heritage centres, museums, patent offices, funders, facilities, and publishing infrastructures.
Core Architecture Components
Application Layer
- Application factory pattern with comprehensive extension integration
- Blueprint-based modular architecture with 20+ service-specific routes
- RESTful API design with JSON payload communication
- Swagger/OpenAPI documentation integration via Flasgger
Service Layer
- External service integration abstractions
- Background processing for batch operations
- Authentication and token management
- Error handling and fallback mechanisms
Data Layer
- SQLAlchemy ORM with PostgreSQL backend
- Complex relational model with cascading relationships
- Migration management via Alembic
- Reference data and controlled vocabularies
Key Architectural Patterns
The system implements a centralized approach to managing multiple persistent identifier types including DOIs, DOCiD™s, Handles, and other PIDs. This ensures consistency across all identifier operations and provides a unified interface for external systems.
Key Features:
- Unified identifier assignment and management
- Support for multiple identifier schemes simultaneously
- Automatic identifier validation and format checking
- Cross-reference resolution between different identifier types
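As a rough illustration of automatic validation and format checking, the sketch below detects the scheme of an incoming identifier with regular expressions. The function names, the patterns, and the list of resolver prefixes are assumptions made for this example, not the system's actual rules; the DOCiD™ format is not specified here, so it is left out.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only. DOIs are checked first
# because every DOI is also a syntactically valid Handle.
_PATTERNS = {
    "doi": re.compile(r"^10\.\d{4,9}/\S+$"),
    "handle": re.compile(r"^\d+(?:\.\d+)*/\S+$"),
}

_RESOLVER_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://hdl.handle.net/",
    "http://hdl.handle.net/",
)


def normalize(identifier: str) -> str:
    """Strip whitespace and a known resolver prefix, if present."""
    value = identifier.strip()
    lowered = value.lower()
    for prefix in _RESOLVER_PREFIXES:
        if lowered.startswith(prefix):
            return value[len(prefix):]
    return value


def detect_scheme(identifier: str) -> Optional[str]:
    """Return the first scheme whose pattern matches, or None."""
    value = normalize(identifier)
    for scheme, pattern in _PATTERNS.items():
        if pattern.match(value):
            return scheme
    return None
```

Keeping the patterns in one mapping means a new scheme can be added without touching the detection logic, which fits the idea of a single entry point for all identifier types.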
The comments system uses a tree structure that supports threaded discussions with parent-child relationships, status management, and moderation.
Implementation Details:
- Self-referential database relationships
- Recursive comment retrieval algorithms
- Status-based visibility management
- Hierarchical permission inheritance
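The recursive retrieval and status-based visibility described above can be sketched in plain Python. This example works on flat rows shaped like the `publication_comments` table rather than on ORM objects. The `build_thread` helper and the rule that only `active` and `edited` comments are visible are assumptions for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

# Assumption: comments in any other status are hidden from readers.
VISIBLE_STATUSES = {"active", "edited"}


def build_thread(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Turn flat comment rows into a nested list of top-level threads."""
    # Group rows by parent so each level can be looked up directly.
    children: Dict[Optional[int], List[Dict[str, Any]]] = defaultdict(list)
    for row in rows:
        children[row["parent_comment_id"]].append(row)

    def collect(parent_id: Optional[int]) -> List[Dict[str, Any]]:
        nodes = []
        for row in children.get(parent_id, []):
            if row["status"] not in VISIBLE_STATUSES:
                continue  # hiding a comment also hides its replies
            nodes.append({
                "id": row["id"],
                "text": row["comment_text"],
                "replies": collect(row["id"]),
            })
        return nodes

    # Top-level comments have no parent.
    return collect(None)
```

Fetching all comments for a publication in one query and nesting them in memory avoids issuing one query per reply level, at the cost of loading hidden rows that are then filtered out.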
External service connectors are abstracted into dedicated service classes, providing consistent interfaces for interacting with Crossref, CSTR, CORDRA, ROR, ORCID, and Local Contexts APIs.
Service Patterns:
- Standardized authentication token management
- Consistent error handling and retry logic
- Unified data transformation interfaces
- Configurable service endpoints and credentials
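One way to get consistent retry behaviour across connectors is a small wrapper that every service class calls. The sketch below is a minimal version using exponential backoff. `ServiceError`, `call_with_retry`, and the default values are hypothetical names and settings chosen for this example, not part of the documented services.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class ServiceError(Exception):
    """Hypothetical error raised when an external service call fails."""


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ServiceError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles after each failure: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays, and `retry_on` lets each connector decide which failures are worth retrying.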
Database Architecture
The system uses PostgreSQL as its primary database and relies on features such as JSON fields, full-text search, and indexes on frequently queried columns.
Core Data Models
Publications is the central entity. It holds publication metadata together with the DOCiD™ and DOI assignment, and has cascading relationships to files, creators, organizations, funders, and projects.
from datetime import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, String, Text
from sqlalchemy.orm import relationship

# `db` is assumed to be the application's Flask-SQLAlchemy instance.
class Publications(db.Model):
    __tablename__ = 'publications'

    id = Column(Integer, primary_key=True, autoincrement=True)
    user_id = Column(Integer, ForeignKey('user_accounts.user_id'), nullable=False, index=True)
    document_docid = Column(String(255), nullable=False)
    document_title = Column(String(255), nullable=False)
    document_description = Column(Text)
    resource_type_id = Column(Integer, ForeignKey('resource_types.id'), nullable=False)
    doi = Column(String(50), nullable=False)
    published = Column(DateTime, default=datetime.utcnow)

    # Relationships: child rows are deleted together with the publication
    user_account = relationship('UserAccount', back_populates='publications')
    # back_populates mirrors PublicationCreators.publication so both sides stay in sync
    publication_creators = relationship('PublicationCreators', back_populates='publication', cascade="all, delete-orphan")
    publication_organizations = relationship('PublicationOrganization', cascade="all, delete-orphan")
    publication_funders = relationship('PublicationFunders', cascade="all, delete-orphan")
UserAccount handles user management, with social authentication support and profile information that includes ORCID and ROR ID fields.
class UserAccount(db.Model):
    __tablename__ = "user_accounts"

    user_id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    user_name = db.Column(db.String(50), nullable=False)
    full_name = db.Column(db.String(100), nullable=False)
    email = db.Column(db.String(100), unique=True, nullable=False)
    orcid_id = db.Column(db.String(50), nullable=True)
    ror_id = db.Column(db.String(50), nullable=True)
    affiliation = db.Column(db.String(100), nullable=True)
    role = db.Column(db.String(50), nullable=True)
    country = db.Column(db.String(50), nullable=True)

    # Social media profile links
    linkedin_profile_link = db.Column(db.String(255), nullable=True)
    github_profile_link = db.Column(db.String(255), nullable=True)

    # Relationships
    publications = db.relationship('Publications', back_populates='user_account')
PublicationCreators manages authors and creators, with role assignment and support for multiple identifier schemes.
class PublicationCreators(db.Model):
    __tablename__ = 'publication_creators'

    id = Column(Integer, primary_key=True, autoincrement=True)
    publication_id = Column(Integer, ForeignKey('publications.id'), nullable=False, index=True)
    family_name = Column(String(255), nullable=False)
    given_name = Column(String(255))
    identifier = Column(String(500))  # Full resolvable URL (e.g., https://orcid.org/...)
    identifier_type = Column(String(50))  # Type (e.g., 'orcid', 'isni', 'viaf')
    role_id = Column(String(255), nullable=False)

    # Relationships
    publication = relationship('Publications', back_populates='publication_creators')
Supported Creator Identifiers:
- ORCID: https://orcid.org/0000-0000-0000-0000
- ISNI: https://isni.org/isni/0000000000000000
- VIAF: http://viaf.org/viaf/123456789
- ResearcherID: http://www.researcherid.com/rid/A-1234-2012
- Scopus ID: Scopus author identifier URLs
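Because creator identifiers are stored as full URLs, an ORCID value can be checked before it is saved. ORCID iDs end in a check character computed with ISO 7064 MOD 11-2, so a validator can verify both the URL shape and the checksum. The function names below are hypothetical, and whether the system performs this check is an assumption.

```python
import re

# Accepts the canonical URL form: four hyphen-separated groups of four,
# where the final character may be a digit or "X".
_ORCID_URL = re.compile(
    r"^https?://orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$"
)


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for char in base_digits:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid_url(url: str) -> bool:
    """Check both the URL shape and the trailing check character."""
    match = _ORCID_URL.match(url.strip())
    if match is None:
        return False
    compact = match.group(1).replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]
```

For example, `is_valid_orcid_url("https://orcid.org/0000-0002-1825-0097")` returns `True`, while changing the last digit to `8` makes it return `False`.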
PublicationComments stores a hierarchical comment structure with parent-child relationships, status management, and engagement tracking.
class PublicationComments(db.Model):
    __tablename__ = 'publication_comments'

    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    publication_id = db.Column(db.Integer, db.ForeignKey('publications.id'), nullable=False)
    user_id = db.Column(db.Integer, db.ForeignKey('user_accounts.user_id'), nullable=False)
    parent_comment_id = db.Column(db.Integer, db.ForeignKey('publication_comments.id'), nullable=True)
    comment_text = db.Column(db.Text, nullable=False)
    comment_type = db.Column(db.String(50), default='general')
    status = db.Column(db.String(20), default='active')
    likes_count = db.Column(db.Integer, default=0)
    created_at = db.Column(db.DateTime, default=datetime.utcnow)

    # Self-referential relationship for hierarchical comments
    parent_comment = db.relationship('PublicationComments', remote_side=[id], backref='replies')
Comment Features:
- Hierarchical Structure: Parent-child relationships for threaded discussions
- Comment Types: general, review, question, suggestion
- Status Management: active, edited, deleted, flagged states
- Engagement: Like counts and reply threading
- Moderation: User vs admin permissions for editing/deletion
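Since `comment_type` and `status` are stored as free-form strings, the allowed values listed above can be pinned down in code with enumerations. The sketch below is one possible approach. The class names and the `parse_comment_type` helper are hypothetical, and the fallback to `general` for empty input is an assumption that mirrors the column default.

```python
from enum import Enum


class CommentType(str, Enum):
    GENERAL = "general"
    REVIEW = "review"
    QUESTION = "question"
    SUGGESTION = "suggestion"


class CommentStatus(str, Enum):
    ACTIVE = "active"
    EDITED = "edited"
    DELETED = "deleted"
    FLAGGED = "flagged"


def parse_comment_type(raw: str) -> CommentType:
    """Map request input to a CommentType, defaulting to 'general' when empty."""
    value = (raw or "").strip().lower()
    if not value:
        return CommentType.GENERAL
    try:
        return CommentType(value)
    except ValueError:
        allowed = ", ".join(t.value for t in CommentType)
        raise ValueError(f"comment_type must be one of: {allowed}") from None
```

Because both enums subclass `str`, their members compare equal to the plain strings stored in the table, so existing rows need no conversion.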
Authentication & Authorization
JWT-Based Authentication
- Flask-JWT-Extended implementation
- Access and refresh token management
- Role-based access control
- Social authentication integration
Social Authentication Providers
- Google OAuth 2.0
- ORCID OAuth 2.0
- GitHub OAuth 2.0
Permission Management
- User role hierarchy (user, admin, superadmin)
- Resource-based permissions
- Comment moderation permissions
- Publication ownership controls
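The role hierarchy and the ownership rules can be combined into a single check. The sketch below assumes that roles are strictly ordered (user < admin < superadmin) and that a comment may be modified by its author or by anyone at admin level or above. The `Actor` class and both helper functions are hypothetical names for illustration.

```python
from dataclasses import dataclass

# Higher rank means more privileges; the ordering follows the role list above.
ROLE_RANK = {"user": 0, "admin": 1, "superadmin": 2}


@dataclass(frozen=True)
class Actor:
    user_id: int
    role: str


def has_role(actor: Actor, minimum: str) -> bool:
    """True when the actor's role is at or above the required role."""
    # Unknown roles rank below every known role.
    return ROLE_RANK.get(actor.role, -1) >= ROLE_RANK[minimum]


def can_modify_comment(actor: Actor, comment_owner_id: int) -> bool:
    """Authors may modify their own comments; admins may modify any."""
    return actor.user_id == comment_owner_id or has_role(actor, "admin")
```

The same pattern would extend to publication ownership by passing the publication's `user_id` instead of the comment owner's.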
Database Relationships
Relationship Type | Primary Entity | Related Entity | Description |
---|---|---|---|
One-to-Many | UserAccount | Publications | User ownership of publications |
One-to-Many | Publications | PublicationComments | Publication discussions |
One-to-Many | Publications | PublicationCreators | Multiple authors per publication |
One-to-Many | Publications | PublicationOrganization | Multiple affiliations |
One-to-Many | Publications | PublicationFunders | Multiple funding sources |
Self-Referential | PublicationComments | PublicationComments | Parent-child hierarchy for threaded comments |
Reference Data Models
DOCiD™ uses controlled vocabularies, stored as reference tables, to keep metadata classification consistent.
Model | Table | Purpose | Key Fields |
---|---|---|---|
ResourceTypes | resource_types | Publication resource classifications | id, resource_type |
CreatorsRoles | creators_roles | Creator role taxonomies | id, role_id, role_name |
FunderTypes | funder_types | Funding organization categories | id, funder_type_name |
PublicationTypes | publication_types | Document type classifications | id, publication_type_name |
PublicationIdentifierTypes | identifier_types | Identifier scheme registry | id, identifier_type_name |