Cath Domall File (CDF)

Version 2.0

The CATH Domall file describes domain boundaries for entries in the CATH database. Every PDB chain in CATH that contains one or more domains has a CathDomall entry. A whole-chain domain can be identified by a domain count of 1 and a fragment count of 0 (that is, 'D01 F00').

  • Comment lines start with a '#' character
  • Segments are continuous sequence regions of domains
  • Fragments are small regions of the protein chain that are excluded from the domain definition

Column  Description
1       Chain name (5 characters)
2       Number of domains (formatted 'D%02d')
3       Number of fragments (formatted 'F%02d')
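
As a minimal sketch, the three leading columns can be read with a plain whitespace split. This assumes fields are whitespace-separated and that the counts always carry the 'D' and 'F' prefixes described above. The function names parse_header, is_whole_chain and data_lines are hypothetical and not part of any CATH tool.

```python
def parse_header(line):
    """Return (chain_name, n_domains, n_fragments) for one Domall line."""
    fields = line.split()  # assumption: whitespace-separated fields
    chain_name, dom_field, frag_field = fields[0], fields[1], fields[2]
    if not (dom_field.startswith("D") and frag_field.startswith("F")):
        raise ValueError(f"unexpected count fields: {dom_field!r} {frag_field!r}")
    return chain_name, int(dom_field[1:]), int(frag_field[1:])


def is_whole_chain(line):
    """True when the entry has exactly one domain and no fragments."""
    _, n_domains, n_fragments = parse_header(line)
    return n_domains == 1 and n_fragments == 0


def data_lines(lines):
    """Yield the lines that are neither blank nor '#' comments."""
    for line in lines:
        if line.strip() and not line.startswith("#"):
            yield line


example = "1chmA  D02 F00  1  A    2 - A  156 -  1  A  157 - A  402 -"
print(parse_header(example))    # ('1chmA', 2, 0)
print(is_whole_chain(example))  # False
```

data_lines accepts any iterable of strings, so an open file handle can be passed to it directly.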

The remaining fields on each line describe the segments of each domain, followed by any fragments. Their layout is best explained using examples.

Example Domall Entries

KEY:
N  = Number of segments
C  = Chain character
I  = Insert character/code ('-' indicates no insert character)
S  = Start PDB number
E  = End PDB number
NR = Number of residues, in parentheses (fragment information only)

Example 1

1chmA  D02 F00  1  A    2 - A  156 -  1  A  157 - A  402 -
                N |C    S I C    E I| N |C    S I C    E I|
               |<----Domain One---->|<-----Domain Two---->|
                  |<--Segment One-->|   |<--Segment One-->|

This translates to:

Domain Chain Start/Stop
1chmA01 A 2-156
1chmA02 A 157-402

Example 2

1cnsA  D02 F00  2  A    1 - A   87 -  A  146 - A  243 -  1  A   88 - A  145 -

                N |C    S I C    E I| C    S I C    E I| N |C    S I C    E I|
               |<--------------Domain One------------->|<-----Domain Two---->|
                  |<--Segment One-->|<---Segment Two-->|   |<--Segment One-->|

This translates to:

Domain Chain Start/Stop
1cnsA01 A 1-87, 146-243
1cnsA02 A 88-145
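
The translation above can be sketched in code. This is an illustrative reading of the examples, not an official parser: it assumes whitespace-separated tokens, six tokens per segment (C S I C E I), and integer residue numbers. The names parse_domains, residue and domain_table are hypothetical.

```python
def parse_domains(line):
    """Return one list of segments per domain for a single Domall line."""
    tokens = line.split()
    n_domains = int(tokens[1][1:])  # 'D02' -> 2
    pos = 3                         # skip chain name, D and F fields
    domains = []
    for _ in range(n_domains):
        n_segments = int(tokens[pos])  # N in the key above
        pos += 1
        segments = []
        for _ in range(n_segments):
            # Six tokens per segment: C S I C E I
            chain, start, s_ins, _end_chain, end, e_ins = tokens[pos:pos + 6]
            segments.append((chain, int(start), s_ins, int(end), e_ins))
            pos += 6
        domains.append(segments)
    return domains


def residue(number, insert):
    """Render a residue number, appending the insert code unless it is '-'."""
    return f"{number}{'' if insert == '-' else insert}"


def domain_table(line):
    """Return rows of 'domain-id chain ranges' in the style used above."""
    chain_name = line.split()[0]
    rows = []
    for i, segments in enumerate(parse_domains(line), start=1):
        ranges = ", ".join(
            f"{residue(s, si)}-{residue(e, ei)}" for _, s, si, e, ei in segments
        )
        rows.append(f"{chain_name}{i:02d} {segments[0][0]} {ranges}")
    return rows


example = "1cnsA  D02 F00  2  A    1 - A   87 -  A  146 - A  243 -  1  A   88 - A  145 -"
print("\n".join(domain_table(example)))
# 1cnsA01 A 1-87, 146-243
# 1cnsA02 A 88-145
```

The domain identifier is built here by appending a two-digit domain number to the chain name, which matches the identifiers shown in the examples.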

Fragment Information

Fragments are small regions of the protein chain that are not included in the domain definition. Their residue ranges are appended after the domain segment information. The format differs from the segment format in two ways: there is no leading segment count (N), and each range is followed by its number of residues (NR) in parentheses.

Example 3

1amg0  D02 F01  1  0    1 - 0  360 -  1  0  362 - 0  417 -  0  361 - 0  361 - (1)
                N |C    S I C    E I| N |C    S I C    E I| C    S I C    E I  NR|
               |<----Domain One---->|<-----Domain Two---->|<---Fragment One----->|
                  |<--Segment One-->|   |<--Segment One-->|

This translates to:

Domain Chain Start/Stop
1amg001 0 1-360
1amg002 0 362-417

Fragment = 361

Example 4

1bcmA  D02 F02  1  A  257 - A  487 -  1  A  492 - A  559 -  A  488 - A  491 - (4)  A  560 - A  560 - (1)
                N |C    S I C    E I| N |C    S I C    E I| C    S I C    E I  NR| C    S I C    E I  NR|
               |<----Domain One---->|<-----Domain Two---->|<---Fragment One----->|<---Fragment Two----->|
                  |<--Segment One-->|   |<--Segment One-->|

This translates to:

Domain Chain Start/Stop
1bcmA01 A 257-487
1bcmA02 A 492-559

Fragments = 488-491, 560
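
Fragment ranges can be extracted in a similar way. The sketch below is a hedged illustration based only on the examples above: it assumes whitespace-separated tokens, six tokens per segment, seven tokens per fragment (C S I C E I plus the parenthesised residue count as a single token), and integer residue numbers. The names parse_fragments and format_fragments are hypothetical.

```python
def parse_fragments(line):
    """Return a list of (start, end, n_residues) tuples for one Domall line."""
    tokens = line.split()
    n_domains = int(tokens[1][1:])    # 'D02' -> 2
    n_fragments = int(tokens[2][1:])  # 'F02' -> 2
    pos = 3
    # Skip the domain information: each domain is a segment count N
    # followed by N segments of six tokens each.
    for _ in range(n_domains):
        pos += 1 + 6 * int(tokens[pos])
    fragments = []
    for _ in range(n_fragments):
        # Seven tokens per fragment: C S I C E I (NR)
        _c1, start, _i1, _c2, end, _i2, nr = tokens[pos:pos + 7]
        fragments.append((int(start), int(end), int(nr.strip("()"))))
        pos += 7
    return fragments


def format_fragments(fragments):
    """Render ranges as in the text, collapsing single-residue fragments."""
    return ", ".join(
        str(start) if start == end else f"{start}-{end}"
        for start, end, _ in fragments
    )


example = (
    "1bcmA  D02 F02  1  A  257 - A  487 -  1  A  492 - A  559 -"
    "  A  488 - A  491 - (4)  A  560 - A  560 - (1)"
)
print(parse_fragments(example))  # [(488, 491, 4), (560, 560, 1)]
print("Fragments =", format_fragments(parse_fragments(example)))
# Fragments = 488-491, 560
```

Insert codes are ignored here for brevity; a fuller parser would carry them along with the residue numbers.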
