Difference between revisions of "CDF"

From Game Research Wiki
 
(19 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[Category:Archives]]
[[Category:Archives]]
Used in the following games:
Used in the following games:
* Parasite Eve 2 (PSX)
* [[Parasite Eve II]] (PSX)


Parasite Eve 2 data archive format. PlayStation games are tricky to reverse-engineer because the game may not use file positions to access data but rather the physical disc sectors, which means the original disc image would be needed to extract data properly.
Note: Formats are aligned to sectors. Any change to files may require a full rebuild of the whole image. This is typical of PS1 games.


There may be a pattern for how files are stored and accessed. If a sector is the start of a new file, its data (after the CD-XA header) begins with a 16-byte header: first a seemingly random 4-byte number, then a single byte giving the number of sectors the file occupies. The rest is zeroed out. With this, I was able to write a program that went sector by sector looking for a matching header, and I was able to extract 1,049 background images in full.


Notes:
Comparing all STAGE*.CDF files...
Location: 8224
{| class="wikitable"
{| class="wikitable"
!colspan="15"|Disk 1 (SLUS-01042)
!colspan="15"|File Formats
|-
! LBA !! Name !! Description
|-
| 0000023 || SYSTEM.CNF ||
|-
| 0000024 || PE_DISK.01 || Unknown if used. Size of sector?
|-
| 0000025 || SLUS_010.42 || Program
|-
| 0000204 || INIT.BS || "Published by..." image.
|-
| 0000210 || STAGE0.HED || Header? Possibly some LBA locations
|-
|-
| 0000214 || STAGE0.CDF || Data archive?
! Value !! Temp !! Occurrences (Disk1) !! Occurrences (Disk2)
|-
|-
| 0030255 || STAGE1.CDF || Data archive. Has soundless STR
| 05FF0008 || BS Format (Backgrounds) || ? || ?
|-
|-
| 0075564 || STAGE2.CDF || Data archive. Has soundless STR
| 07FF0008 || String (1byte)? || ? || ?
|-
|-
| 0100127 || STAGE3.CDF || Data archive
| 02010008 || Unknown || ? || ?
|-
|-
| 0117576 || INTER0.STR || All FMV with sound.
| 01010008 || Unknown || ? || ?
|-
|-
| 0225526 || DUMMY.DMY || Dummy file that is actually a video not related to the game.
| 06010008 || Unknown || ? || ?
|}
|}


Note on INTER0.STR
Temp: 2018/03/05 From what I have gathered, CDF does not have any type of index in itself; however, each entry has a 16-byte identifier at the start of the data portion of the sector. This contains a unique identifier that is used to determine the content type, followed by at least 1 byte that indicates how many frames the data occupies. This matches my original findings from long ago, but now it seems clearer.
Due to the file size, I suspect that this may contain all the FMV data on the disc. However, it was very common for PSX games to access the disc directly rather than through the file system. This makes extracting data more complicated, at least for FMV/STR/videos. Other files may be handled the same way since, given the technical limits of the hardware, developers had to be creative with their engines.
The only issue may be that certain sectors are in a different mode, so the data portion of the sector may be a different length. Scanning a raw dump of the disc might work out better, as certain data types such as FMV or audio may use this other CD-XA mode, so the structure of the sector is different. My tool will need to be able to handle both paths: one where you scan the CDF directly but may have to skip any file types that require sectors with more than 2048 bytes of data, and one that can scan a bin file and handle those special cases. I still need to find an index for the game. The HED file and PE_DISK.01 and .02 are byte-for-byte identical on each disc.


Due to the way the game works, it may make more sense to have a page for the whole game rather than for just one format. PSX games are a pain to deal with since the formats are, at times, tied to sectors.
TEMP: 2023/1/24
Some additional notes before a re-write. The index for these is not based on the LBA of the disc; offsets are relative to the location inside the file itself. For example, the index entries in STAGE0's separate header start from 0. Others have the header embedded at the start of the file. There is more than just a normal index, as there may be folders and then a separate list of files that may be marked as streaming. This requires more looking into, but it could be because streaming is video-related and may use a different reading mode and sector size.




{| class="wikitable"
== See Also ==
!colspan="15"|Disk 1 (SLUS-01042)
[[HED]]
|-
! LBA Range !! Header !! Description
|-
| 498-510 || 05FF00080D0000000000000000000000 ||[[File:Pe2 loadscreen01.jpg|80px|thumbnail|center]]
|-
| 518-530 || 05FF00080D0000000000000000000000 ||
|-
| 538-551 || 05FF00080E0000000000000000000000 || Loading Screen
|-
| 559-572 || 05FF00080E0000000000000000000000 || Loading Screen
|-
| 580-592 || 05FF00080D0000000000000000000000 || Loading Screen
|}

Latest revision as of 02:48, 25 January 2023

Used in the following games:
* Parasite Eve II (PSX)

Note: Formats are aligned to sectors. Any change to files may require a full rebuild of the whole image. This is typical of PS1 games.

There may be a pattern for how files are stored and accessed. If a sector is the start of a new file, its data (after the CD-XA header) begins with a 16-byte header: first a seemingly random 4-byte number, then a single byte giving the number of sectors the file occupies. The rest is zeroed out. With this, I was able to write a program that went sector by sector looking for a matching header, and I was able to extract 1,049 background images in full.
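
The sector-by-sector scan described above could look roughly like the following Python sketch. This is not the original tool: the function name `scan_entries` and its parameters are made up for illustration, and it assumes the input has already been reduced to 2048-byte data sectors and that the count byte includes the sector holding the header.

```python
def scan_entries(data: bytes, type_id: bytes, sector_size: int = 2048):
    """Return (sector_index, sector_count) for each sector that starts an entry.

    A sector is treated as the start of an entry when its first 16 bytes are:
    the 4-byte type_id, a non-zero count byte, then 11 zero bytes.
    """
    entries = []
    total = len(data) // sector_size
    sector = 0
    while sector < total:
        start = sector * sector_size
        header = data[start:start + 16]
        count = header[4]
        if header[:4] == type_id and count > 0 and header[5:] == bytes(11):
            entries.append((sector, count))
            # Assumption: the count covers the header sector itself,
            # so the next candidate is `count` sectors further on.
            sector += count
        else:
            sector += 1
    return entries
```

For example, `scan_entries(open("STAGE1.CDF", "rb").read(), bytes.fromhex("05FF0008"))` would list candidate background entries. Matching on a known 4-byte value keeps false positives down compared with only checking for the zeroed tail.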

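{| class="wikitable"
!colspan="4"|File Formats
|-
! Value !! Temp !! Occurrences (Disk1) !! Occurrences (Disk2)
|-
| 05FF0008 || BS Format (Backgrounds) || ? || ?
|-
| 07FF0008 || String (1byte)? || ? || ?
|-
| 02010008 || Unknown || ? || ?
|-
| 01010008 || Unknown || ? || ?
|-
| 06010008 || Unknown || ? || ?
|}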
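
As a small illustration of how the values above might be used, the sketch below splits a 16-byte entry header into its type value, the tentative name from the table, and the count byte. The names `KNOWN_TYPES` and `describe_header` are hypothetical, and the labels are only the working names listed here, not confirmed formats.

```python
# Type values and tentative labels copied from the File Formats table.
KNOWN_TYPES = {
    "05FF0008": "BS Format (Backgrounds)",
    "07FF0008": "String (1byte)?",
    "02010008": "Unknown",
    "01010008": "Unknown",
    "06010008": "Unknown",
}


def describe_header(header: bytes):
    """Return (type_value, label, count) for a 16-byte entry header."""
    if len(header) != 16:
        raise ValueError("expected a 16-byte entry header")
    type_value = header[:4].hex().upper()  # first 4 bytes, as in the table
    label = KNOWN_TYPES.get(type_value, "Not in table")
    count = header[4]                      # fifth byte: sector count
    return type_value, label, count
```

A header such as `05FF00080D0000000000000000000000` would come back as type `05FF0008`, label "BS Format (Backgrounds)", and a count of 13.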

Temp: 2018/03/05 From what I have gathered, CDF does not have any type of index in itself; however, each entry has a 16-byte identifier at the start of the data portion of the sector. This contains a unique identifier that is used to determine the content type, followed by at least 1 byte that indicates how many frames the data occupies. This matches my original findings from long ago, but now it seems clearer. The only issue may be that certain sectors are in a different mode, so the data portion of the sector may be a different length. Scanning a raw dump of the disc might work out better, as certain data types such as FMV or audio may use this other CD-XA mode, so the structure of the sector is different. My tool will need to be able to handle both paths: one where you scan the CDF directly but may have to skip any file types that require sectors with more than 2048 bytes of data, and one that can scan a bin file and handle those special cases. I still need to find an index for the game. The HED file and PE_DISK.01 and .02 are byte-for-byte identical on each disc.
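
For the raw bin path, a sketch along these lines could pull the data portion out of one raw sector. It assumes the standard CD-ROM XA Mode 2 layout (12-byte sync, 4-byte header, 8-byte subheader, then 2048 bytes of data for Form 1 or 2324 bytes for Form 2, selected by bit 5 of the submode byte); whether every sector on this disc follows that layout has not been checked here, and the function name is made up.

```python
RAW_SECTOR_SIZE = 2352
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"  # standard 12-byte sync pattern


def sector_user_data(raw: bytes) -> bytes:
    """Return the data portion of one raw 2352-byte Mode 2 (XA) sector."""
    if len(raw) != RAW_SECTOR_SIZE:
        raise ValueError("expected a 2352-byte raw sector")
    if raw[:12] != SYNC:
        raise ValueError("missing sync pattern")
    if raw[15] != 2:
        raise ValueError("only Mode 2 sectors are handled in this sketch")
    submode = raw[18]                    # third byte of the XA subheader
    is_form2 = bool(submode & 0x20)      # bit 5 set means Form 2
    size = 2324 if is_form2 else 2048
    return raw[24:24 + size]             # data starts after sync+header+subheader
```

Form 1 sectors give the same 2048 bytes you would see when reading the CDF through the file system, while Form 2 sectors (typically FMV or audio) give the longer 2324-byte payload that the file-based scan would have to skip.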

TEMP: 2023/1/24 Some additional notes before a re-write. The index for these is not based on the LBA of the disc; offsets are relative to the location inside the file itself. For example, the index entries in STAGE0's separate header start from 0. Others have the header embedded at the start of the file. There is more than just a normal index, as there may be folders and then a separate list of files that may be marked as streaming. This requires more looking into, but it could be because streaming is video-related and may use a different reading mode and sector size.


See Also

HED