Data Recovery

on a Linux system


When you look up 'data destruction' on the WWW, chances are you find lots of links : how to salvage a crashed hard disk, how to recover erased or lost data, how to get your data back when you formatted your hard disk by accident ... However, most of these are from companies that offer such recovery services, or sell software to try it yourself. While browsing a Linux support forum, I noticed that a lot of newbies still manage to lose files, either by messing up partition tables, by formatting partitions, or simply by deleting files they did not really want deleted. Some desktop environments offer some sort of "Trash bin" where deleted files go. Samba has a 'recycle bin' option to keep "deleted" files in a hidden directory. Nonetheless, Unix expects that if you delete a file, you want it to be gone, not just moved to a hidden directory. That makes sense - but for all those occasions where you wish you could undo the delete operation, I decided to have a look at undelete features for Linux.

This is merely a preliminary investigation and by no means complete. There's also no guarantee that you will always be able to recover lost data. Some operations described here can potentially cause more data loss. And no undelete feature should ever be considered a replacement for backups or be part of a recovery plan. It's more of a last resort for those "I hope I'll never need this but you never know" cases.

On the other hand, if you really want to delete a file and make sure no one will be able to recover it, look into data destruction methods.


There are several ways you can 'lose' data, and the recovery method will depend on how you lost the data in the first place. We will look at the following cases :

  1. deleted 1 or more files and/or directories
  2. filesystem corrupted
  3. deleted a partition
  4. reformatted a partition
  5. created new partitions and lost a few partitions in the process
  6. all partitions have disappeared
  7. computer won't boot

When you see your files disappear, you'll probably go "what have I done !?". That's a good reaction. Do it again : think about what you have done : did you execute a rm or del command ? were you repartitioning a disk ? were you planning to format a partition ? ... If you choose the wrong recovery approach, e.g. try to recover deleted files (case 1) by trying to 'repair a partition table' (case 5 or 6), you'll probably make things worse.

The following is an investigation into some do-it-yourself recovery scenarios, for direct attached disks. Complex disk systems, such as RAIDs and Logical Volumes (spanning disks), are a different story and are not considered here (although some of the solutions discussed further may also apply to those storage systems). Note also that we'll assume your operating system is Linux, although we will also look at ntfs partitions. For those scenarios where you need to recover a system that won't boot or where partition tables and filesystems have disappeared, we'll assume you'll use a Linux Live CD, a bootable CD that contains a Linux system that runs entirely off the CD (and RAM drives). This not only lets you boot, but enables you to approach unmounted disks on the patient system and run tools that bypass the filesystem. In some (file recovery) cases, you'll also need a writable medium (additional hard disk, usb storage, ...) where you can save recovered files - you do not want to write them to the disk you're trying to recover.


deleted 1 or more files and/or directories

On a test system, I've copied some files out of /etc to /srv/data1 (partition with ext2 filesystem) and deleted a file /srv/data1/testfileshadow, a copy of /etc/shadow. Let's see if we can get it back.

When you need to attempt an undelete operation, always go into "single user" mode, to prevent data from other users overwriting the file(s) you need to recover.

Old School Unix : just 'grep' it

Use a grep syntax such as

	grep -a -b 'search-text' /dev/partition > path/to/outputfile
	## OR
	grep -a -B[lines before] -A[lines after] 'search-text' /dev/partition > path/to/outputfile

	# -i : ignore case in both the pattern and the input
	# -a : process a binary file as if it were text
	# -b : show the byte offset in the input where the match is found
	# -B NUM : print NUM lines of leading context before each matching line
	# -A NUM : print NUM lines of trailing context after each matching line
	

In this example we are looking for the shadow file, which most likely contains the string "root:". Note that you grep in a partition, not a directory.

	~:# grep -a -b "root:" /dev/sdb1
	4756605:root:*::
	4979728:root:x:0:
	5005319:root:$1$X291ChRT$rJ82K4WcE1u2B65oHj55m1:13881:0:99999:7:::
	5010672:root:$1$X291ChRT$rJ82K4WcE1u2B65oHj55m1:13881:0:99999:7:::

	~:# grep -a -B1 -A5 "root:" /dev/sdb1
	root:$1$X291ChRT$rJ82K4WcE1u2B65oHj55m1:13881:0:99999:7:::
	daemon:*:13881:0:99999:7:::
	bin:*:13881:0:99999:7:::
	sys:*:13881:0:99999:7:::
	sync:*:13881:0:99999:7:::
	games:*:13881:0:99999:7:::
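
The byte offset that 'grep -b' reports can be fed to dd to cut the surrounding data out of the partition. The following is only a sketch : a scratch image file stands in for /dev/sdb1 (an assumption, so that it can be tried without touching a real disk), the variable names are made up for the illustration, and GNU grep is assumed. It adds '-o', so that the offset is that of the match itself rather than of the start of the "line", which on raw data can begin far before the text you're after.

```shell
#!/bin/bash
# Sketch : carve text out of raw data at the offset reported by grep -b.
# IMG is a scratch file standing in for a partition such as /dev/sdb1.
IMG=$(mktemp)
OUT=$(mktemp)

# fake "partition" : binary padding, the "deleted" text, more padding
head -c 4096 /dev/zero > "$IMG"
printf 'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/bin/sh\n' >> "$IMG"
head -c 4096 /dev/zero >> "$IMG"

# -a : treat binary as text, -b : print byte offset, -o : offset of the match itself
OFFSET=$(LC_ALL=C grep -a -b -o 'root:' "$IMG" | head -n 1 | cut -d: -f1)
echo "first match at byte $OFFSET"

# read 256 bytes starting at that offset and drop the NUL padding
dd if="$IMG" bs=1 skip="$OFFSET" count=256 2>/dev/null | tr -d '\000' > "$OUT"
cat "$OUT"

rm -f "$IMG"
```

On a real partition you would replace the scratch file by the device name, pick a larger count, and write the output to a different disk than the one you are recovering from.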
	

At some point, deleted files (and other files which matched the grep search) showed up in /srv/data1/lost+found. I don't know how to reproduce that.

undelete on an ext2 filesystem

The (Debian) packages discussed below (e2undel, e2fsprogs, recover) contain tools for undelete operations on an ext2 filesystem.

Most tools will want or prefer to operate on unmounted partitions. If you need to undelete from a system disk, you'll have to attach the disk to a 2nd computer where these tools are installed. You will also need additional storage space to save recovered files or partitions : you don't want the recovered files to be written over any other data you're still trying to recover.

e2undel

See also the Ext2fs Undeletion Howto.

Read the man page for e2undel, especially the DESCRIPTION section :
e2undel searches all inodes marked as deleted on a file system and lists them sorted by owner and time of deletion. Additionally, it gives you the file size and tries to determine the file type. If you did not just delete a whole bunch of files with a 'rm -r *', this information should be helpful to find out which of the deleted files you would like to recover.

e2undel does not actually undelete a file (i.e., does not manipulate ext2 internal structures like inode, block bitmap, and inode bitmap). Instead it recovers the data of a deleted file and saves it in a new file.

 
	~# e2undel -a -d /dev/sdb1 -s /srv/recovery/2
	
	[interactive session to determine what you're looking for]

	  inode     size  deleted at        name
	-----------------------------------------------------------
	     13      717  Mar 22 14:42 2008 (not in log file)
	     14     4696  Mar 22 14:42 2008 (not in log file)
	     15     2944  Mar 22 14:42 2008 (not in log file)
	     16      724  Mar 22 14:42 2008 (not in log file)
	     17       44  Mar 22 14:42 2008 (not in log file)


	Select an inode listed above or press enter to go back: 13
	717 bytes written to /srv/recovery/2/inode-13

	Select an inode listed above or press enter to go back:
	...
	

e2undel can be set up to undelete files by filename, but this requires that e2undel was set up in advance, with a specific configuration. See the SETUP section in man e2undel.

See also : Undeleting files on the Linux ext2 filesystem with debugfs and e2undel

More tools : e2fsprogs

Why this does not work with ext3

In general, ext2 and ext3 are compatible file systems: You can mount an ext3 fs as ext2 and even use the ext2 low level utilities like debugfs. However, ext3 behaves in a different manner in one crucial point: If a file is deleted, its inode data are removed, too. Especially, the list of data blocks is lost; so it is not possible to recover any deleted file.

See also : HOWTO recover deleted files on an ext3 file system

recover

'recover' works pretty much the same way as e2undel : you 'find' deleted files by owner id, period of deletion, filesize, ... . recover also lets you specify 'text that the file should contain' and, most important, batch-processes all inodes and dumps the contents of the recovered files in a directory of your choice. Simply run 'recover' and answer the questions.

	~:# recover
	[interactive session]

	~:# ls  /srv/recovery/3
	dump111  dump14  dump22  dump29  dump37  dump44  dump51  dump58  dump66
	dump112  dump15  dump23  dump30  dump38  dump45  dump52  dump59  dump67
	...	 ...
	

The filename is lost, but the content is recovered.

recover howto

magicrescue

magicrescue recovers deleted files of a given file type, by looking at the magic numbers that denote the file type. Useful if you know what type of file you're looking for. It's interactive, so you can just run it and answer the questions. You can also run it in batch mode, with commandline options to specify how you want magicrescue to operate. See also man magicrescue.

Works on any filesystem. Magicrescue scans the drive and looks for data, so it can also be used to recover content of a corrupted disk or a lost partition. For filetypes that magicrescue doesn't know, you can create your own 'recipe'.
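
To give an idea of what a "magic number" is : most file formats start with a few fixed bytes, and a tool can recognise a file by those bytes alone, without a filename. The sketch below is not how magicrescue itself is implemented; it is a hypothetical helper (guess_type is a made-up name) that knows only four well-known signatures, and could be used to sort out the nameless dump files that the recovery tools leave behind.

```shell
#!/bin/bash
# Sketch : guess a file's type from its first 4 bytes ("magic number").
# guess_type is a made-up helper that only knows a handful of signatures.
guess_type() {
	local magic
	# od prints the first 4 bytes in hex; tr strips blanks and the newline
	magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
	case "$magic" in
		ffd8ff*)  echo jpeg ;;
		89504e47) echo png ;;
		25504446) echo pdf ;;      # the ascii text "%PDF"
		1f8b*)    echo gzip ;;
		*)        echo unknown ;;
	esac
}

# demo on a scratch file that starts with the PNG signature
DEMO=$(mktemp)
printf '\211PNG\r\n\032\n' > "$DEMO"
echo "$DEMO : $(guess_type "$DEMO")"
```

You could loop over a directory of recovered files with something like 'for f in /srv/recovery/3/dump* ; do echo "$f : $(guess_type "$f")" ; done'. The standard 'file' command does the same job with a far larger signature database.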

photorec

Similar to magicrescue and recover, photorec scans partitions, finds all data, and copies the data to new files. Originally intended to recover lost graphic files (jpg, ...), it has been extended to find other file types as well, and has a config file to specify file types to look for. It's a batch process, and the results are dumped in a directory of choice.

	~:# photorec
	[interactive session]

	/srv/recovery/4/recup_dir.1# ls
	f114688.txt  f147472.txt  f163880.txt  f180256.txt  
	f114696.txt  f147480.txt  f163888.txt  f180272.txt 
	...
	

photorec is included in the debian package 'testdisk'. See also the PhotoRec home page and documentation.

Recover text from a lost disk or partition

As with the old school "grep", you can use the 'strings' program (debian package : binutils) to find text on a disk, even if there's no more partition table or filesystem. 'strings' is usually used to find ascii text in binary files (eg to extract comments or values of char constants and variables in binary files such as compiled programs), but you can also use it to read raw data of a disk and look for text. This can be useful to at least save some data before you attempt more intrusive and possibly destructive recovery methods, or to just look for passwords or other interesting stuff on a re-formatted disk or in deleted files.

	# example : reading text of a raw disk
	strings -a /dev/sdb

	# example : reading text of a partition, bypassing the filesystem
	strings -a /dev/sdb2
	

When you redirect the output to a file, you'll end up with one file containing all found text : strings reads the bytes directly off the hard drive and is not aware of file names or end-of-file markers.

Repair a filesystem (ext2, ext3)

Filesystem errors can be fixed with fsck. fsck is a wrapper : it will invoke a suitable ext2 or ext3 tool to check for and fix filesystem errors. Most Linux systems will by default execute fsck after a given number of system boots, or if a filesystem appears 'unclean' during boot.

Repairs need to be done on a filesystem that is unmounted or mounted read-only. For trouble detected during boot, the system will automatically reboot with its / filesystem mounted read-only for fsck to do its work.

The 'root' of a filesystem is the "superblock", a special block that describes the filesystem and that all inodes (and thus files) depend upon. If this block goes bad, the filesystem is unusable. This can be fixed by using one of the spare superblocks. See man e2fsck, option -b (superblock).
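
To use 'e2fsck -b' you need to know where the spare superblocks are. The sketch below computes the likely locations under two assumptions that hold for a filesystem created with mke2fs defaults but that you should verify for your own : the 'sparse_super' feature is on (backups live in block groups 1 and powers of 3, 5 and 7) and each group holds 8 x blocksize blocks. backup_superblocks is a made-up helper name.

```shell
#!/bin/bash
# Sketch : list the probable backup superblock locations of an ext2/ext3 filesystem.
# Assumes sparse_super and the default number of blocks per group.
# usage : backup_superblocks BLOCK_SIZE TOTAL_BLOCKS
backup_superblocks() {
	local bs=$1 total=$2
	local per_group=$(( bs * 8 ))     # one bitmap block describes bs*8 blocks
	local first=0
	if [ "$bs" -eq 1024 ]; then first=1; fi   # with 1k blocks, group 0 starts at block 1
	local groups=$(( (total - first + per_group - 1) / per_group ))
	local base g
	{
		echo 1                        # group 1 always carries a backup
		for base in 3 5 7; do         # as do groups that are powers of 3, 5 and 7
			g=$base
			while [ "$g" -lt "$groups" ]; do
				echo "$g"
				g=$(( g * base ))
			done
		done
	} | sort -n | while read -r g; do
		if [ "$g" -lt "$groups" ]; then
			echo $(( g * per_group + first ))
		fi
	done
}

# example : a filesystem of 300000 blocks of 4096 bytes
backup_superblocks 4096 300000
```

With a result such as 32768 you could then try 'e2fsck -b 32768 -B 4096 /dev/sdb1'. If the filesystem was created with non-default options these numbers will be off; 'mke2fs -n' (which only prints what it would do, provided you pass the same options as when the filesystem was made) or a saved 'dumpe2fs' listing give the real locations.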

Salvage data from a broken filesystem or a damaged disk

If the filesystem appears unrepairable, you can salvage data from it by bypassing the filesystem and reading the disk, eg with dd, or, better yet, ddrescue.

ddrescue works like dd, but will attempt to rescue data in case of read errors. ddrescue does not truncate the output file if not asked to. So, every time you run it on the same output file, it tries to fill in the gaps. If you have two or more damaged copies of a file, cdrom, disk, etc and run ddrescue on all of them, one at a time, with the same output file, you will probably obtain a complete and error-free file.

Because dd and ddrescue read the entire disk, including boot sectors and partition tables, you are in effect cloning the disk : the output will contain partitions and filesystems, rather than a dump of nameless files.
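
A minimal sketch of such a cloning operation, run on scratch files so it can be tried safely : SRC stands in for the damaged disk (e.g. /dev/sdb) and IMAGE for a file on separate storage. Both names, and the paths in the ddrescue comment, are assumptions made for the illustration.

```shell
#!/bin/bash
# Sketch : image a "disk" with dd, then verify the copy.
SRC=$(mktemp)       # stands in for the damaged disk, e.g. /dev/sdb
IMAGE=$(mktemp)     # on a real rescue : a file on a different disk

# 1 MiB of test data (a multiple of the block size used below)
head -c 1048576 /dev/urandom > "$SRC"

# noerror : carry on after a read error
# sync    : pad unreadable or short blocks with zeroes so offsets stay aligned
dd if="$SRC" of="$IMAGE" bs=64k conv=noerror,sync 2>/dev/null

if cmp -s "$SRC" "$IMAGE"; then
	RESULT="image is identical"
else
	RESULT="image differs"
fi
echo "$RESULT"

# With ddrescue (debian package gddrescue) the call would look something like
#   ddrescue /dev/sdb /mnt/usb/sdb.img /mnt/usb/sdb.map
# where the last argument is a map file that records which areas were read,
# so that a later run can retry only the gaps. The paths are hypothetical.
```

Once you have an image, you can point the recovery tools at the image file instead of the disk, which leaves the original untouched.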

recover a deleted partition

Expect the unexpected : if you have to re-partition to recover a lost partition, back up all important data just in case. This may include a backup of your system configuration.

Plan ahead : to be able to re-create the original partition scheme, it helps to have it documented : partitions, filesystem, partition size, start and end, mount points, .... Consider creating images of the disk or the partitions for easy restore. See also the section on disaster recovery with mondo (further down).

To document your current partition layout, you can use 'fdisk -l', 'sfdisk -l' or 'parted DEVICE print'.

These tools return partition start and end cylinders or sectors, sizes in blocks, etc. If you keep the output of 'sfdisk -l' or 'parted DEVICE print', you can use these output files later as a reference to recreate a lost partition table very precisely with sfdisk or parted (a dump made with 'sfdisk -d' can even be fed straight back into sfdisk). Your chances of recovering a lost partition (and the files in it) are quite good, but decrease if the disk space you want to recover was used in the meantime to store other files.

test case : recovering a lost partition

Case : you woke up this morning, and your /home partition is gone. The /home directory was mounted on, say, /dev/sdb1, but apparently you lost that partition; /home is still there (in / ) but it's empty.

You investigate, and yes indeed, /dev/sdb1 no longer exists.

	Disk /dev/sdb: 2147 MB, 2147483648 bytes
	255 heads, 63 sectors/track, 261 cylinders
	Units = cylinders of 16065 * 512 = 8225280 bytes

	   Device Boot      Start         End      Blocks   Id  System
	/dev/sdb2   *         231         261      249007+  83  Linux
	

So far, nothing is lost. Probably all that happened is that an entry in the partition table got deleted, but you haven't altered the actual bytes on the disk. All you need to do is create a partition table record for /dev/sdb1.

You can more or less guess that, as fdisk tells you that /dev/sdb2 starts at cylinder 231, /dev/sdb1 probably started at cylinder 1 and ended at cylinder 230, and was a primary partition of type "83 : Linux". Of course, if you had documented your partition layout, you'd know for sure. Anyway, you relax, then try to re-create /dev/sdb1 :

	debian:~# fdisk /dev/sdb

	Command (m for help): n
	Command action
	   e   extended
	   p   primary partition (1-4)
	p
	Partition number (1-4): 1
	First cylinder (1-261, default 1): 1
	Last cylinder or +size or +sizeM or +sizeK (1-230, default 230): 230

	Command (m for help): w
	The partition table has been altered!

	Calling ioctl() to re-read partition table.

	WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
	The kernel still uses the old table.
	The new table will be used at the next reboot.
	Syncing disks.
	debian:~#
	

At this point, you reboot. The operating system will use the newly created partition table. Hopefully, the newly added partition will contain the old partition's filesystem (so don't format !), and all will be well : your original /home directory with all its subdirectories and files will magically reappear.

If the filesystem does not reappear : try to recover data (undelete tools).
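
The guess about cylinder boundaries made above can be cross-checked with some arithmetic on the figures fdisk printed for the test disk (255 heads, 63 sectors/track, 512 byte sectors). The sketch below is only that arithmetic; the variable and function names are made up, and it assumes partitions aligned on cylinder boundaries, as on this disk.

```shell
#!/bin/bash
# Sketch : convert the cylinder numbers from the fdisk listing to sectors and blocks.
HEADS=255
SECTORS_PER_TRACK=63
SECTOR_BYTES=512
SECTORS_PER_CYL=$(( HEADS * SECTORS_PER_TRACK ))            # 16065

# first sector of a cylinder (fdisk counts cylinders from 1)
cyl_start() { echo $(( ($1 - 1) * SECTORS_PER_CYL )); }

SDB2_START=$(cyl_start 231)                                 # 3694950
SDB2_SECTORS=$(( (261 - 231 + 1) * SECTORS_PER_CYL ))       # 498015
SDB2_BLOCKS=$(( SDB2_SECTORS * SECTOR_BYTES / 1024 ))       # 249007, rounded down

echo "sdb2 starts at sector $SDB2_START"
echo "sdb2 holds $SDB2_SECTORS sectors = $SDB2_BLOCKS blocks of 1k"
echo "room in front of sdb2 : cylinders 1-230 = $(( SDB2_START * SECTOR_BYTES )) bytes"
```

The 249007 blocks match the "249007+" in the fdisk listing (the + marks the half block that was rounded off), which confirms that /dev/sdb2 really occupies cylinders 231 to 261 and that the 230 cylinders in front of it are where /dev/sdb1 must have been.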

alternatives to using fdisk

Parted
Has a rescue option to re-create a lost partition (fix the partition table). You need to have some idea about the partition size and/or start/end sectors.
TestDisk
testdisk is a partition recovery tool that is capable of finding deleted partitions underneath newer partitions / partition tables.
See e.g. the testdisk tutorial : an excellent howto on using testdisk to recover a lost partition, with demo/screenshots of a recovery operation and links to background information.

reformatted a partition

As long as you have just been adding and removing partitions, you can recover by re-creating the original partition table. This is usually enough to also recover the filesystem originally present in that partition, because the partition table contains a marker for the filesystem type. Things change drastically when you format a partition. A "format" or "make filesystem" operation differs depending on the filesystem you're creating, and the filesystem already present on the partition. If, for instance, you re-format a Linux extended filesystem to ntfs using mkfs.ntfs, the partition is initialized with "all zeroes". If you re-format with ext2 or ext3, you create new inodes, so undelete tools that work by looking for inodes with a 'deleted' flag obviously won't work.

In cases like these, your best bet is probably to try and salvage any data you can still find, before trying to restore the original partition. Suggestions : the tools described above that read the raw partition and bypass the filesystem, such as grep, strings, magicrescue and photorec.

This is a good example of why it's a bad idea to have passwords in plain text on disk, and why to be careful when getting rid of old computers or disks.

overwritten a partition by enlarging another one

Case : on a disk that already contains partitions and a filesystem with directories and files, you decided to delete or resize partitions, or added one or more partitions. When you exit the partition manager (and reboot), files seem to have disappeared (eg. your /home partition is gone, the Windows ntfs partition is not there anymore, or your system doesn't boot - an indication that the boot loader can't find the partition with startup boot files).

Say you have a disk /dev/sdb with two partitions : /dev/sdb1 and /dev/sdb2. At some point, you decide to get rid of /dev/sdb2 and enlarge /dev/sdb1 to span the entire disk. Using a partition editor (eg. GParted Live CD), you delete /dev/sdb2 and let /dev/sdb1 grow to take all remaining space. When you reboot, you notice that your /home partition is empty : /dev/sdb2 was mounted to /home - you made a mistake.

Say "Oh no, what have I done ?". Then answer yourself : I've removed a partition - see if I can create it again exactly as it was.

Solution : relax. Your files are probably still there - but the partition table got messed up so that your operating system does not know where or how to look for them. What you need to do is restore the partition table in its original state, so that "everything falls into place" again. You can try to achieve this by using a partition manager and re-create the exact disk / partition layout from before the disaster. In a worst case scenario, you'll have to do this by partition size. Better options are : by cylinders or by blocks. Note that your chances for success decrease dramatically if the partitions in question have been reformatted.

Proof-of-concept : recovery of a partition that has been deleted and overwritten by enlarging another partition

  1. backup files on existing partition so they don't get lost in the process
  2. gparted : reduce size of partition to make sufficient space for deleted partition. Ideally : restore the original size of the enlarged partition
  3. testdisk or fdisk : re-create deleted partition. don't format
  4. resize 2nd partition to its original size if you made it too small in step 2
  5. reboot and pray

all partitions have disappeared

This shouldn't be a problem if you've documented your partition table : you can just recreate it the way you fixed the partition table to recover a lost partition. If that's not the case, you can let a tool such as testdisk (see above) scan the disk for lost partitions.

computer won't boot

TODO : repair MBR, fix GRUB, ...


Disaster Recovery Planning

A data recovery toolset is no replacement for good backups, and disaster recovery planning should never depend on data recovery tools. Still, data recovery tools may some day become your last resort, so it's a good idea to take proactive measures to facilitate the use of such tools, such as documenting partition layout, partition sizes and boundaries, hard disk geometry, RAID and Logical Volume configuration, ... , of your servers.

You can also have a look at mondo, a disaster recovery suite for Linux. It contains tools for image-based backups and bare metal system recovery, and you can build a recovery plan around that. This is all very well illustrated on the mondo web site (Documentation) and in this article on Linux Journal that illustrates some implementations of mondo for bare metal disaster recovery, clone systems, and system restore use cases.


More ...


These are the debian packages that contain the tools discussed in this article :

	#!/bin/bash
	##
	##	 install a collection of data recovery tools
	##
	TOOLS=" e2fsprogs 	\
		e2undel 	\
		magicrescue	\
		recover 	\
		ntfsprogs 	\
		gpart		\
		parted		\
		testdisk	\
		binutils	\
		gddrescue	\
		mondo		\
		" #closing quote, end of list
	

	apt-get update
	for PKG in $TOOLS ; do
		apt-get -y install $PKG
	done
	

The following commands create a text file with information you may need to repair broken disks, partitions and filesystems. Run this when your system is still OK, and keep its output somewhere safe.

	#!/bin/bash
	##
	##	 collect information on disks, partitions , filesystems
	##
	OUTPUTDIR=/root			#move output to a safe place later !!
	(
		echo "fdisk output"
		fdisk -l
		# repeat for each partition; keep only the /dev/... names from the listing
		for P in $(fdisk -l | grep -o '^/dev/[^ ]*'); do fdisk -l "$P"; done

		echo "fstab"
		cat /etc/fstab

	) > $OUTPUTDIR/disklayout

	
	

Koen Noens
December 2005
rewritten March 2008