Tuesday, October 30, 2012

How to recover a deleted document

If you need to recover an object that was deleted from a Documentum repository, there is a good chance you can still recover its content. Here you'll find the steps to recover a document's content even without knowing many details about it.
Object metadata can be recovered only if you have a database backup made before the deletion.
Still, the most important thing is usually the content itself, not the metadata, so we'll focus on recovering the content of the removed document.

The first and most important thing to do is to disable the dm_DMClean job, which cleans up orphaned objects, including content objects. Check the job's last execution time: if it ran after the document was deleted, I'm sorry, the content is lost (well, if you have both a DB and a content backup you can recover anything you want).
Also check the dm_DMFileScan job. It is usually disabled; if it's enabled, you'd better disable it until you recover your document.

Next, our task is to find the dmr_content object which has information about the content location.
As there might be thousands of orphaned content objects, try to get as much information as possible about the deleted document:
1) Date/time of deletion and user who deleted the document
2) Date/time of creation / last modification of content (checkin)
3) File format, approx. size, object name

The query to get the content objects having no associated metadata objects (dm_sysobject) is:
select * from dmr_content where any parent_id is null

The problem is that this query will most probably return far too many results, and I guess you don't want to still be looking for the right document when you turn 65 :)

Now let's see how this information can help you narrow down the results:
1) Date/time of deletion and user who deleted the document
Assuming auditing is enabled, you can get some information from the audit trail:
select * from dm_audittrail where event_name='dm_destroy' and time_stamp > date('some date before deletion') and user_id = (select r_object_id from dm_user where user_name='USER_WHO_DELETED')

From the results returned, if you find a record that seems to represent the deleted document, grab the object_name value.

2) Date/time of creation / last modification of content (checkin):
select r_object_id,full_format from dmr_content where any parent_id is null and set_time > date([time before creation]) and set_time < date([time after creation])

3) File format, approx. size, object name (possibly grabbed at step 1):
select r_object_id,full_format from dmr_content where any parent_id is null and full_format='[FORMAT]' and content_size > [MIN_SIZE] and content_size < [MAX_SIZE] and set_file like '%[OBJECT NAME]%'

Note: You can combine the filters from points 2 and 3 if you have this information. The more filters you use, the fewer results you'll get.
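
If you end up trying many filter combinations, a small helper can assemble the query for you. The following is only a sketch in Python: the function names (dql_quote, build_orphan_query) and parameter names are made up for illustration, and it assumes you pass dates as strings in a format your repository accepts. It only builds the query text; you still run it in your usual DQL client.

```python
def dql_quote(value):
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + str(value).replace("'", "''") + "'"


def build_orphan_query(set_after=None, set_before=None, full_format=None,
                       min_size=None, max_size=None, name_part=None):
    """Build one DQL query from whichever filters are known (hypothetical helper)."""
    # Always start from the orphaned-content condition used in this post.
    clauses = ["any parent_id is null"]
    if set_after is not None:
        clauses.append("set_time > date(" + dql_quote(set_after) + ")")
    if set_before is not None:
        clauses.append("set_time < date(" + dql_quote(set_before) + ")")
    if full_format is not None:
        clauses.append("full_format=" + dql_quote(full_format))
    if min_size is not None:
        clauses.append("content_size > " + str(int(min_size)))
    if max_size is not None:
        clauses.append("content_size < " + str(int(max_size)))
    if name_part is not None:
        clauses.append("set_file like " + dql_quote("%" + name_part + "%"))
    return ("select r_object_id,full_format from dmr_content where "
            + " and ".join(clauses))


if __name__ == "__main__":
    # Example values are invented; replace them with what you know.
    print(build_orphan_query(full_format="pdf", min_size=100000,
                             max_size=300000, name_part="invoice"))
```

Filters you leave out are simply skipped, so you can start broad and add conditions until the result list is manageable.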

OK, so now you have a list of content objects (hopefully not too long). Next, get the corresponding paths to the files in the file store.
For each ID in the list, generate a DQL command:
execute get_path for '[ID]'
where [ID] is the r_object_id of the dmr_content object
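
Generating these commands by hand gets tedious for a long list. Here is a minimal Python sketch that turns a list of IDs into such a script. The function name get_path_script and the example IDs are hypothetical, and the optional terminator is there only because some clients (idql, for example) expect a "go" line after each statement.

```python
def get_path_script(content_ids, terminator=None):
    """Return one 'execute get_path' command per dmr_content ID (hypothetical helper)."""
    lines = []
    for content_id in content_ids:
        content_id = content_id.strip()
        if not content_id:
            continue  # skip blank lines from a copy-pasted result list
        lines.append("execute get_path for '" + content_id + "'")
        if terminator:
            lines.append(terminator)
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented IDs, only to show the shape of the output.
    ids = ["0600000180001234", "0600000180005678"]
    print(get_path_script(ids, terminator="go"))
```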

Executing the generated script gives you a list of file paths. Copy the results to a file.
Now you have two options for getting the files:
1) You can get the content files directly (without creating objects in the repository) using the obtained paths. Generate a script that copies all the files to your folder. For example, you can use commands like:
cp [OBTAINED PATH] /target_folder/[COUNTER].[FULL_FORMAT]

where COUNTER is a counter (1..n) that prevents name conflicts during the copy operation.

2) Create new objects in the docbase by generating a DQL script with statements like:
create my_type object set object_name='Some identifier', link '[Folder path]', setfile '[PATH]' with content_format='[FORMAT]'
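
Both options boil down to generating one line per recovered path, so either script can be produced with a few lines of Python. This is a sketch under assumptions: the function names, the recovered_N object names and the example paths are invented, and each input entry is a (path, full_format) pair collected in the previous steps. Keep in mind that full_format is a Documentum format name, which may differ from the usual file extension.

```python
import shlex


def dql_quote(value):
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + str(value).replace("'", "''") + "'"


def copy_script(entries, target_folder="/target_folder"):
    """Option 1: one cp command per file, numbered to avoid name conflicts."""
    lines = []
    for counter, (path, full_format) in enumerate(entries, start=1):
        dest = target_folder + "/" + str(counter) + "." + full_format
        # shlex.quote protects paths that contain spaces or shell metacharacters.
        lines.append("cp " + shlex.quote(path) + " " + shlex.quote(dest))
    return "\n".join(lines)


def create_object_script(entries, object_type, folder_path):
    """Option 2: one 'create ... object' statement per file."""
    lines = []
    for counter, (path, full_format) in enumerate(entries, start=1):
        lines.append(
            "create " + object_type + " object set object_name="
            + dql_quote("recovered_" + str(counter))
            + ", link " + dql_quote(folder_path)
            + ", setfile " + dql_quote(path)
            + " with content_format=" + dql_quote(full_format)
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented paths, only to show the shape of the output.
    found = [("/data/store/80/00/12/34.pdf", "pdf"),
             ("/data/store/80/00/56/78.pdf", "pdf")]
    print(copy_script(found))
    print(create_object_script(found, "my_type", "/Temp/Recovered"))
```

The numbered names make it easy to match a recovered file or object back to its line in your list of paths.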

If you recovered several documents, you can open them and find the one you were looking for.
Once you're happy that you have recovered the deleted document, don't forget to re-enable the dm_DMClean job if you disabled it.
