PhotoRec Data Carving


From CGSecurity

Data carving is the process of extracting a collection of data from a larger data set. Data carving techniques are frequently used during a digital investigation, when the unallocated file system space is analyzed to extract files. The files are "carved" from the unallocated space using file type-specific header and footer values. File system structures are not used during the process. This is exactly how PhotoRec works.

The Digital Forensics Research Workshop has issued a Data Carving challenge. The data set for this challenge is a 50MB raw file. It has no file system, but it contains JPEG, ZIP, HTML, Text, and Microsoft Office files and fragments. The goal is to extract as many complete JPEG, ZIP, HTML, Text, and Office files as possible from it. Using this challenge as a test bed, PhotoRec has been improved to recover even more data than before. Everyone is welcome to contribute to the project.

Contents
1 Data Recovery Process
1.1 PhotoRec
1.2 Manual recovery of remaining JPEG
1.3 Manual recovery of zip files
1.4 Manual recovery of XLS/Ole file
2 Disk Layout
3 Files
4 Conclusion

Data Recovery Process

PhotoRec

The first step was to run PhotoRec. Version 6.5-WIP (WIP = Work In Progress) was used. PhotoRec scanned the image file for known headers and successfully recognised all JPEG, OLE/Office, HTML and ZIP headers. There were no false positives.
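The header scan described above can be illustrated with a small sketch. This is not PhotoRec's code: it only checks three widely documented magic numbers (JPEG, ZIP and OLE) at the start of each 512-byte sector, and the names `SIGNATURES` and `scan_headers` are hypothetical. HTML and text are left out because they have no fixed binary header.

```python
# Well-known magic numbers for three of the formats named in the article.
SIGNATURES = {
    "jpg": b"\xff\xd8\xff",
    "zip": b"PK\x03\x04",
    "ole": b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",
}
SECTOR_SIZE = 512  # assumption: files start on 512-byte sector boundaries


def scan_headers(data: bytes, sector_size: int = SECTOR_SIZE):
    """Return (sector_number, file_type) for each sector starting with a known header."""
    hits = []
    for offset in range(0, len(data), sector_size):
        for file_type, magic in SIGNATURES.items():
            if data.startswith(magic, offset):
                hits.append((offset // sector_size, file_type))
                break
    return hits


if __name__ == "__main__":
    # Synthetic three-sector "image": a JPEG header, an empty sector, a ZIP header.
    image = (
        SIGNATURES["jpg"].ljust(SECTOR_SIZE, b"\x00")
        + bytes(SECTOR_SIZE)
        + SIGNATURES["zip"].ljust(SECTOR_SIZE, b"\x00")
    )
    print(scan_headers(image))
```

A real carver needs more than this: a header match only marks a candidate start, and the footer, size and validity checks are what turn a candidate into a recovered file.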
The JPEG footer is used to determine the file size, and PhotoRec checks the validity of each recovered JPEG using libjpeg. ZIP footers are detected, but the file integrity isn't checked. The OLE file format is very complex, with internals similar to a file system, but PhotoRec is able to get the file size by analyzing the FAT. Text files are hard to detect because they have no header. After a UTF-8 to ASCII translation, PhotoRec calculates the index of coincidence to determine whether a sector holds text or random data. False positives can occur if a Doc or an HTML file isn't properly detected (i.e. fragmented data).

Manual recovery of remaining JPEG

PhotoRec can handle some forms of data fragmentation in JPEG files: using the libjpeg library, it is able to check the recovered data. This way, it was able to recover 9 JPEGs perfectly. Manual recovery was then used for the remaining files. Using dd and PhotoRec, additional files were recovered.

A picture of a hedgehog begins at sector 31475 and a picture of Mars begins at sector 31533. Extract from photorec.log:

    31475-31532: jpg
    31533-32836: jpg

The second picture begins before the first one is finished, so both recovered pictures are corrupted. The PhotoRec log file shows that the Mars picture is corrupted after about 118784 bytes (JPG error at offset 118784). Let's try to find the exact size of the data fragment.

    $ dd if=dfrws-2006-challenge.raw of=mars1.jpg skip=31533 count=`expr 118784 / 512`
    232+0 records in
    232+0 records out
    $ display mars1.jpg
    display: Corrupt JPEG data: premature end of data segment `mars1.jpg'.
    display: Corrupt JPEG data: premature end of data segment `mars1.jpg'.

The extracted JPEG fragment is 232 sectors long, but garbage can be seen at the end of the image, which means the fragment is too large. By trial and error, it's possible to determine that the fragment is 220 sectors long.
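For readers who prefer to script the extraction, the dd call with skip and count can be mirrored in a few lines of Python. This is a minimal sketch, assuming 512-byte sectors (dd's default block size); the helper names `read_sectors` and `sectors_before_offset` are hypothetical and not part of PhotoRec or dd.

```python
SECTOR_SIZE = 512  # dd's default block size, also the sector size used in this article


def read_sectors(image_path: str, skip: int, count: int,
                 sector_size: int = SECTOR_SIZE) -> bytes:
    """Read `count` sectors starting at sector `skip`, like `dd skip=... count=...`."""
    with open(image_path, "rb") as image:
        image.seek(skip * sector_size)
        return image.read(count * sector_size)


def sectors_before_offset(byte_offset: int, sector_size: int = SECTOR_SIZE) -> int:
    """Whole sectors that fit before a byte offset, i.e. `expr offset / 512`."""
    return byte_offset // sector_size


if __name__ == "__main__":
    # The JPG error offset from photorec.log gives the first candidate length.
    print(sectors_before_offset(118784))  # 232
```

With the challenge image at hand, `read_sectors("dfrws-2006-challenge.raw", 31533, 232)` would return the same bytes that the first dd command writes to mars1.jpg; shortening `count` one step at a time reproduces the trial-and-error search.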
    $ dd if=dfrws-2006-challenge.raw of=mars2.jpg skip=31533 count=220
    220+0 records in
    220+0 records out
    $ display mars2.jpg
    display: Corrupt JPEG data: premature end of data segment `mars2.jpg'.
    display: Corrupt JPEG data: premature end of data segment `mars2.jpg'.

There is no garbage left in the picture. The layout is now:

    31475-31532: jpg fragment, hedgehog
    31533-31752: jpg fragment, mars
    31753-32836 ?

The hedgehog picture is rebuilt from its first fragment followed by everything after the Mars fragment:

    $ dd if=dfrws-2006-challenge.raw skip=31475 count=`expr 31532 - 31475 + 1` > hedgehog.jpg
    58+0 records in
    58+0 records out
    $ dd if=dfrws-2006-challenge.raw skip=31753 count=`expr 32836 - 31753 + 1` >> hedgehog.jpg
    1084+0 records in
    1084+0 records out
    $ display hedgehog.jpg

Now, the exact file size can be found by running PhotoRec on the recovered picture.

    $ photorec hedgehog.jpg
    PhotoRec 6.4, Data Recovery Utility, June 2006
    Christophe GRENIER <grenier@cgsecurity.org>
    http://www.cgsecurity.org
    Please wait...
    Disk hedgehog.jpg - 584 KB / 571 KiB - CHS 1 255 63 (RO), sector size=512
    PhotoRec exited normally.
    $ ls -l recup_dir.1/f0.jpg
    -rw-rw-r-- 1 kmaster kmaster 98354 Jul 10 11:30 recup_dir.1/f0.jpg
    $ md5sum recup_dir.1/f0.jpg
    db89684c177168036e274140ecf766a1 recup_dir.1/f0.jpg

The picture size is 98354 bytes (193 sectors). Since the first hedgehog fragment holds 58 of these sectors, the remaining 135 sectors start at sector 31753, and the Mars picture resumes right after them. We can now recover the Mars picture.

    $ expr 31753 + 193 - 58
    31888
    $ dd if=dfrws-2006-challenge.raw skip=31533 count=220 > mars3.jpg
    220+0 records in
    220+0 records out
    $ dd if=dfrws-2006-challenge.raw skip=31888 count=`expr 32836 - 31888 + 1` >> mars3.jpg

As seen before, it's possible to get the exact file size:

    $ photorec mars3.jpg
    PhotoRec 6.4, Data Recovery Utility, June 2006
    Christophe GRENIER <grenier@cgsecurity.org>
    http://www.cgsecurity.org
    Please wait...
    Disk mars3.jpg - 598 KB / 584 KiB - CHS 1 255 63 (RO), sector size=512
    PhotoRec exited normally.
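The boundary arithmetic used above (`expr 31753 + 193 - 58`) can be written out as a short worked example. It is a sketch under two assumptions: each picture is split into exactly two fragments, and the second fragment is contiguous. The function names are hypothetical. The value 188693 is the Mars file size listed in the Files table.

```python
SECTOR_SIZE = 512


def sectors_needed(size_in_bytes: int, sector_size: int = SECTOR_SIZE) -> int:
    """Number of sectors occupied by a file (ceiling division)."""
    return -(-size_in_bytes // sector_size)


def second_fragment(first_start: int, first_end: int,
                    second_start: int, size_in_bytes: int):
    """Inclusive sector range of the second fragment of a two-fragment file."""
    first_len = first_end - first_start + 1
    remaining = sectors_needed(size_in_bytes) - first_len
    return second_start, second_start + remaining - 1


if __name__ == "__main__":
    hedgehog = second_fragment(31475, 31532, 31753, 98354)
    print(hedgehog)  # (31753, 31887)
    # The Mars picture resumes in the sector right after the hedgehog ends.
    mars = second_fragment(31533, 31752, hedgehog[1] + 1, 188693)
    print(mars)      # (31888, 32036)
```

Both ranges agree with the hedgehog (2/2) and Mars (2/2) rows of the Disk Layout table.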
The recovered Mars picture is written to recup_dir.2:

    $ ls -l recup_dir.2/
    total 192
    -rw-rw-r-- 1 kmaster kmaster 188693 Jul 10 11:47 f0.jpg
    $ md5sum recup_dir.2/f0.jpg
    0915313e99af0f6bf13bc06bcd003113 recup_dir.2/f0.jpg

Manual recovery of zip files

Three ZIP files were recovered by PhotoRec, but one of them is corrupted. A small Perl script was used to fix the ZIP file that PhotoRec found beginning at sector 45015. Using unzip, this script locates and removes the extra sectors present in the file.

Manual recovery of XLS/Ole file

Office documents, including Excel files, use the OLE file format. A document was identified at sector 2051, but PhotoRec did not recover it successfully; the file may be fragmented. An OLE file consists of a header structure followed by a list of sectors. In our case, the sector size is 512 bytes. The SID (Sector Identifier) of the first sector of the directory stream is 1688. The master sector allocation table uses 14 sectors: 1673-1685, 1689. The directory stream, found at image sector 3761, lists the following components:

| Component | SID | Size |
|---|---|---|
| Workbook | 0 | 848333 |
| SummaryInformation | 1657 | 4096 |
| DocumentSummaryInformation | 1690 | 4096 |

| Sectors | Object | SID |
|---|---|---|
| 2051 | Header | N/A |
| 2052-?, ?-3729 | Workbook | 0-1656 |
| x-x+20 | 21 extra sectors, not XLS | N/A |
| 3730-3737 | SummaryInformation | 1657-1664 |
| 3746-3758, 3762 | Allocation Table | 1673-1685, 1689 |
| 3761 | RootDirectory | 1688 |
| 3763-3770 | DocumentSummaryInformation | 1690-1697 |

Unfortunately, I failed to locate the 21 extra sectors. Nevertheless, the latest version of OpenOffice was able to open the corrupted file and display most of the data. A new version of the document can be found on the FCC web site.
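The figure of 21 extra sectors can be checked from the sector table above. In an OLE compound file with 512-byte sectors, SID 0 is the sector immediately following the 512-byte header, so in a contiguous file SID n would sit at image sector 2051 + 1 + n. The sketch below compares that expectation with where each object was actually found; the names `expected_image_sector` and `OBSERVED` are hypothetical.

```python
HEADER_SECTOR = 2051  # image sector holding the OLE header (from the table)


def expected_image_sector(sid: int, header_sector: int = HEADER_SECTOR) -> int:
    """Image sector where SID `sid` would sit if the file were contiguous.

    SID 0 is the sector right after the 512-byte header.
    """
    return header_sector + 1 + sid


# (object, first SID, image sector where it was actually found), from the table
OBSERVED = [
    ("SummaryInformation", 1657, 3730),
    ("Allocation Table", 1673, 3746),
    ("RootDirectory", 1688, 3761),
    ("DocumentSummaryInformation", 1690, 3763),
]

if __name__ == "__main__":
    for name, sid, found_at in OBSERVED:
        shift = found_at - expected_image_sector(sid)
        print(f"{name}: displaced by {shift} sectors")
```

Every object after the Workbook stream is displaced by the same amount, which is consistent with a single foreign run of sectors sitting somewhere inside the Workbook range 2052-3729.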
Disk Layout

| Sectors | File type | Note |
|---|---|---|
| 0-8 | HTML (fragment) | Alice in Wonderland by Lewis Carroll |
| 9-44 | HTML | Alice in Wonderland by Lewis Carroll |
| 2051 | Office (fragment) | Excel, http://www.fcc.gov/Forms/Form477/477.xls |
| 3868-4428 | JPG | 640x481 Mars |
| 4436-4455 | HTML (end is missing) | A STUDY IN SCARLET by Sir Arthur Conan Doyle |
| 4456-4501 | HTML | Stave 1: Marley's Ghost by Charles Dickens |
| 4502-4556 | HTML (beginning is missing) | Stave 1: Marley's Ghost by Charles Dickens |
| 7964-8284 | Office | Upcoming Research Symposium 1/3 |
| 8285-9473 | JPG | http://www.dfrws.org/2004/photos/day2/rodeo1-3-dfrws2004.jpg |
| 9474 | Office | Upcoming Research Symposium 2/3 |
| 10031 | Office | Upcoming Research Symposium 3/3 |
| 11619-11822 | JPG | yeast 1/2 |
| 11823-11848 | Text | Moby Dick, Chapter i - LOOMINGS (page 1-6) |
| 11849-12017 | JPG | yeast 2/2 |
| 12222-26116 | JPG | DFRWS 2006 Forensics Challenge, 11598x11598 |
| 27496-27606 | HTML | The Comedy of Errors by Shakespeare, Act I, Scene I (1/2) |
| 27607-27977 | JPG | The porcupine |
| 27978-28196 | HTML | The Comedy of Errors by Shakespeare (2/2) |
| 28244-28245 | HTML | Moby Dick - chapter 134 (1/2) |
| 28246-28306 | Text (fragment) | De la division du travail social, Emile Durkheim |
| 28307-28344 | HTML | Moby Dick page - chapter 135 (2/2) |
| 28439-28726 | ZIP | Zip Ok |
| 28729-29528 | ZIP | ZIP 1/2 |
| 29529-29895 | HTML | The Tempest, Shakespeare |
| 29896-31368 | ZIP | ZIP 2/2 |
| 31475-31532 | JPG | A hedgehog (1/2) |
| 31533-31752 | JPG | Mars (1/2) |
| 31753-31887 | JPG | A hedgehog (2/2) |
| 31888-32036 | JPG | Mars (2/2) |
| 32837-33397 | Office | http://www.tsa.gov/public/interweb/assetlibrary/Permitted_Prohibited_Facts.doc |
| 34288-34306 | Office | "Reports on Computer Systems Technology" http://csrc.nist.gov/publications/nistpubs/800-26/sp800-26.doc 1/2 |
| 34307-34412 | Text | The Adventure of the Copper Beeches |
| 34413-36236 | Office | http://csrc.nist.gov/publications/nistpubs/800-26/sp800-26.doc 2/2 |
| 36292-36640 | JPG | ? |
| 36998-37649 | Office | PREVENTING CRIME: WHAT WORKS, WHAT DOESN'T, WHAT'S PROMISING http://www.ncjrs.org/docfiles/wholedoc.doc 1/3 |
| 37727-39427 | Office | http://www.ncjrs.org/docfiles/wholedoc.doc 2/3 |
| 39477-40380 | Office | http://www.ncjrs.org/docfiles/wholedoc.doc 3/3 |
| 40638-41219 | JPG | http://www.dfrws.org/2004/photos/day2/rodeo1-breaf-dfrws2004.jpg 1/2 |
| 41239-41609 | JPG | http://www.dfrws.org/2004/photos/day2/rodeo1-breaf-dfrws2004.jpg 2/2 |
| 41611-43433 | JPG | http://imgsrc.hubblesite.org/hu/db/2006/10/images/a/formats/1280_wallpaper.jpg (1/2) |
| 43434-44028 | JPG | http://www.dfrws.org/2004/photos/day2/rodeo1-dfrws2004.jpg |
| 44029-44200 | JPG | http://imgsrc.hubblesite.org/hu/db/2006/10/images/a/formats/1280_wallpaper.jpg (2/2) |
| 45015-45386 | ZIP | Zip 1/2 |
| 45390-45545 | ZIP | Zip 2/2 |
| 45566-45963 | JPG | U. S. Geological Survey Open-File Report 01-154 Slope off Florida Keys http://pubs.usgs.gov/of/2001/of01-154/data/bphotos/1565.jpg 1/2 |
| 45964-46103 | Office | Farm Credit System Insurance Corporation Statement of Financial Condition March 31, 2006 and December 31, 2005 http://www.fcsic.gov/documents/3-31-2006%20Financial%20Statement.doc |
| 46104-46826 | JPG | http://pubs.usgs.gov/of/2001/of01-154/data/bphotos/1565.jpg 2/2 |
| 46910-94836 | JPG | DFRWS 2006 Forensics Challenge, 8640x8640 |
| 94846-95628 | JPG | Saturn http://imgsrc.hubblesite.org/hu/db/2001/15/images/a/formats/full_jpg.jpg (1/2) |
| 95630-96653 | JPG | Saturn http://imgsrc.hubblesite.org/hu/db/2001/15/images/a/formats/full_jpg.jpg (2/2) |

Files

| File type | File size (in bytes) | MD5 hash | Sectors | Note | PhotoRec Score |
|---|---|---|---|---|---|
| HTML (fragment) | 4608 | ec89111e45da8265b641655d0f68725e | 0-8 | Alice in Wonderland by Lewis Carroll | 5 |
| HTML | 18147 | eec87931b03e5a4a4ef8fd51109a1227 | 9-44 | Alice in Wonderland by Lewis Carroll | 5 |
| Office | 869888 | ? | ~ 2051-3770 (21 extra sectors) | http://www.fcc.gov/Forms/Form477/477.xls | 1 |
| JPG | 287186 | daf4205574abd6919b10ca8be92d17a3 | 3868-4428 | 640x481 Mars | 5 |
| HTML (end is missing) | 10240 | 799ad2d2f2f1f17657338d98c97559c4 | 4436-4455 | A STUDY IN SCARLET by Sir Arthur Conan Doyle | 5 |
| HTML | 23544 | f4481ed348d3d59c5dad80afeb0341f9 | 4456-4501 | Stave 1: Marley's Ghost by Charles Dickens | 5 |
| HTML (beginning is missing) | 27875 | baf8b811ee9502408f9f0e73efa77cf0 | 4502-4556 | Stave 1: Marley's Ghost by Charles Dickens | 5 |
| Office | 450048 | 8d2a9a284e078805ada47db191f35244 | 7964-8284, 9474-10031 | Upcoming Research Symposium | 5 |
| JPG | 608703 | 4efc6c572683878efd8f3404ddaded7b | 8285-9473 | http://www.dfrws.org/2004/photos/day2/rodeo1-3-dfrws2004.jpg | 5 |
| JPG | 190720 | 7b07320709e0caa947663f5df3a0a390 | 11619-11822, 11849-12017 | yeast | 5 |
| Text | 12826 | f800a46e18fafd309825c5ee84a654a2 | 11823-11848 | Moby Dick, Chapter i - LOOMINGS (page 1-6) | 3 |
| JPG | 7113968 | b070beae1606f67a342bc5f78c29c743 | 12222-26116 | DFRWS 2006 Forensics Challenge, 11598x11598 | 5 |
| HTML | 168525 | 1959aa0391664b60fd0f2e64ed7a22f4 | 27496-27606, 27978-28196 | The Comedy of Errors by Shakespeare, Act I, Scene I | 2 |
| JPG | 189534 | fe7e7ac67709f2d9c2483aa98c681b99 | 27607-27977 | The porcupine | 5 |
| HTML | 20019 | 045798407b927321326a547704e67831 | 28244-28245, 28307-28344 | Moby Dick - chapter 134 and 135 | 2 |
| Text (fragment) | 30816 | 616a6bbe915c3dbf51014fd76f55b0e3 | 28246-28306 | De la division du travail social, Emile Durkheim | 0 |
| ZIP | 147150 | ebabde39ba44d38888dd82606980498a | 28439-28726 | Zip Ok | 5 |
| ZIP | 1163745 | 9a4c2d3a9bd203eb39c9f954a3c997e4 | 28729-29528, 29896-31368 | ZIP | 5 |
| HTML | 187793 | 158496c522d97b7389c9907cae777ac1 | 29529-29895 | The Tempest, Shakespeare | 5 |
| JPG | 98354 | db89684c177168036e274140ecf766a1 | 31475-31532, 31753-31887 | A hedgehog | 2 |
| JPG | 188693 | 0915313e99af0f6bf13bc06bcd003113 | 31533-31752, 31888-32036 | Mars | 2 |
| Office | 287232 | 0e52e75029e99cd2e9dcd0af271cf4a2 | 32837-33397 | http://www.tsa.gov/public/interweb/assetlibrary/Permitted_Prohibited_Facts.doc | 5 |
| Office | 943616 | d7ff92b8cc1c89c46a78288b9c673152 | 34288-34306, 34413-36236 | http://csrc.nist.gov/publications/nistpubs/800-26/sp800-26.doc | 2 |
| Text | 53870 | 5a12ef9dba88a186ef18a5d349b28e37 | 34307-34412 | The Adventure of the Copper Beeches | 3 |
| JPG | 178659 | 2fae8770cc013d22e9ea1c070f2f509b | 36292-36640 | ? | 5 |
| Office | 1667584 | 4a22f04b097920d11fff4e192e0667a4 | 36998-37649, 37727-39427, 39477-40380 | PREVENTING CRIME: WHAT WORKS, WHAT DOESN'T, WHAT'S PROMISING http://www.ncjrs.org/docfiles/wholedoc.doc | 2 |
| JPG | 487473 | f8c51e0688796b5d616f0e5d4a94d104 | 40638-41219, 41239-41609 | http://www.dfrws.org/2004/photos/day2/rodeo1-breaf-dfrws2004.jpg | 2 |
| JPG | 1021085 | 7cce072e518fd72484c97adb1b4be08e | 41611-43433, 44029-44200 | http://imgsrc.hubblesite.org/hu/db/2006/10/images/a/formats/1280_wallpaper.jpg | 5 |
| JPG | 304413 | c0da37b3f1a07af790e6e9171cedc4d2 | 43434-44028 | http://www.dfrws.org/2004/photos/day2/rodeo1-dfrws2004.jpg | 5 |
| ZIP | 270181 | f940fcc37c82e8ff1431e5c3c061611e | 45015-45386, 45390-45545 | Zip | 2 |
| JPG | 573499 | 2320fe9c41eaddb864a56c2ddc4dd186 | 45566-45963, 46104-46826 | U. S. Geological Survey Open-File Report 01-154 Slope off Florida Keys http://pubs.usgs.gov/of/2001/of01-154/data/bphotos/1565.jpg | 5 |
| Office | 71680 | 109284cc5abddc83879a29785795fd75 | 45964-46103 | Farm Credit System Insurance Corporation Statement of Financial Condition March 31, 2006 and December 31, 2005 http://www.fcsic.gov/documents/3-31-2006%20Financial%20Statement.doc | 5 |
| JPG | 24538540 | db32b271506b2f4974791957627c61cc | 46910-94836 | DFRWS 2006 Forensics Challenge, 8640x8640 | 5 |
| JPG | 924877 | 1a5a843000ef617af93a9cad645e3cdf | 94846-95628, 95630-96653 | Saturn http://imgsrc.hubblesite.org/hu/db/2001/15/images/a/formats/full_jpg.jpg | 1 |

PhotoRec Score Legend:

| Score | Meaning |
|---|---|
| 0 | File not found |
| 1 | First sector identified |
| 2 | + correct file type |
| 3 | + all sectors identified |
| 4 | + correct file size |
| 5 | + correct checksum |

Conclusion

PhotoRec was able to retrieve most files automatically. Results could still be improved by brute-forcing JPG fragment locations or by adding a JPG search-only phase, but this can be time-consuming.
Thanks to

- Daniel Sedory, for letting me know about this contest and for his long-time involvement in the TestDisk/PhotoRec project
- the following ESIEA students, for their work on the OLE file format: Gregory BLANC, Fabien BOUFFARD, Hicham CHAARAOUI, Karim EL FILALI, Amine HASSANI, Igor VALLEE

Christophe GRENIER

This page was last modified 20:33, 22 October 2006. Content is available under GNU Free Documentation License 1.2.
