Tutorial 2: Anonymize your DICOM with the power of Ruby!
Performing simple and advanced DICOM anonymization with ruby-dicom
Admittedly, there are a lot of programs out there that will let you anonymize your DICOM files. Many are free, although some will cost money if you want to unlock their more advanced features. With the Anonymizer class in ruby-dicom it's all free, of course. It tries to offer everything from the quick and simple to the more advanced and customized, for all of you hard-core anonymizers out there! Please have a look at its features. Chances are it will cover your needs, and should you discover that it falls short: take a look at the source code and work with it! Ruby code is surprisingly easy to work with, even if you don't have a lot of experience with it.
Alright, let us start with a list of what you'll need for this tutorial:
RMagick (Optional: Only needed if you want to edit the image data)
This tutorial page actually contains three separate tutorials. We will start with a simple anonymization example, then move on to more advanced features like enumeration of the anonymized values, and for the last tutorial we will go all out and anonymize burned-in information in the image data, as well as burn in a small signature of our own!
Part 1: Perform a simple folder anonymization
Now of course, if you're just going to do a simple, run-of-the-mill anonymization of a few DICOM files, you might as well just open one of the many GUI programs that are available out there and be done with it. Nonetheless, in order to get familiar with the Anonymizer class of ruby-dicom, we will start off with a simple example: anonymizing all DICOM files contained in a given folder (including any subfolders), performed in four small steps. Let's have a look:
$ require 'dicom'
$ a = DICOM::Anonymizer.new
$ a.add_folder("/home/dicom")  # example path; use the folder you want to anonymize
$ a.execute
That's it! The execute method will print some information to the screen as it proceeds through the anonymization process. Your output might look something like this:
irb(main):005:0> a.execute
*******************************************************
Initiating anonymization process.
Searching for files... Done.
2 files have been identified in the specified folder(s).
Separate write folder not specified. Will overwrite existing DICOM files.
Initiating read/update/write process...
Anonymization process completed!
All files in specified folder(s) were SUCCESSFULLY read to DICOM objects.
All DICOM objects were SUCCESSFULLY written as DICOM files (2 files).
Elapsed time: 0.1 seconds
*******************************************************
Now I figure you might ask: which tags were really anonymized here? Finding that out is quite easy. Just call the print method and you'll get a small list answering just that.
This method prints a list of the data elements that are selected for anonymization. For each data element, the list shows its tag, name, type, and the value that will be written to the anonymized files.
irb(main):005:0> a.print
0008,0012  Instance Creation Date      DA  20000101
0008,0013  Instance Creation Time      TM  000000.00
0008,0020  Study Date                  DA  20000101
0008,0023  Content Date                DA  20000101
0008,0030  Study Time                  TM  000000.00
0008,0033  Content Time                TM  000000.00
0008,0080  Institution Name            LO  Institution
0008,0090  Referring Physician's Name  PN  Physician
0008,1010  Station Name                SH  Station
0010,0010  Patient's Name              PN  Patient
0010,0020  Patient ID                  LO  ID
0010,0030  Patient's Birth Date        DA  20000101
0010,0040  Patient's Sex               CS  N
0020,4000  Image Comments              LT
Part 2: Advanced anonymization with enumeration and identity file
In the first example we just performed a simple anonymization using default settings. However, there are quite a few more 'advanced' settings available in the Anonymizer class that you might find useful. Such features include:
- Specifying a separate write path to store the anonymized files, instead of overwriting your existing files.
- Blanking all data element values instead of assigning a custom, general string value.
- Adding several folders to be anonymized (along with their sub-folders).
- Specifying one or several sub-folders to be exempted from anonymization.
- Adding or removing tags from the list of tags that will be anonymized.
- Specifying that all private tags are to be deleted.
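To make a few of these features concrete, here is a stand-alone sketch in plain Ruby. It does not use ruby-dicom at all: a Hash of tag strings stands in for a DICOM file, and the method name anonymize_elements with its blank and delete_private options is made up for this illustration. The one DICOM fact it relies on is that private data elements have an odd group number.

```ruby
# Stand-alone sketch: a plain Hash stands in for a DICOM file.
# anonymize_elements is a hypothetical helper, not part of ruby-dicom.
def anonymize_elements(elements, replacements, blank: false, delete_private: false)
  result = {}
  elements.each do |tag, value|
    group = tag[0, 4].to_i(16)
    # Private data elements have an odd group number; skip them on request.
    next if delete_private && group.odd?
    result[tag] = if replacements.key?(tag)
                    # Selected tag: blank it or write the general replacement.
                    blank ? "" : replacements[tag]
                  else
                    # Tag not selected for anonymization: keep the value.
                    value
                  end
  end
  result
end

elements = {
  "0010,0010" => "John Doe",    # Patient's Name
  "0010,0020" => "123456",      # Patient ID
  "0028,0010" => "512",         # Rows (not selected, left untouched)
  "0029,1010" => "vendor data"  # Private element (odd group)
}
replacements = { "0010,0010" => "Patient", "0010,0020" => "ID" }

anonymized = anonymize_elements(elements, replacements, delete_private: true)
blanked = anonymize_elements(elements, replacements, blank: true)
anonymized.each { |tag, value| puts "#{tag}  #{value}" }
```

Adding or removing a tag from the selection corresponds to adding or deleting a key in the replacements Hash.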
Having talked a bit about features, it's time to get on with our example. What we are going to do here is make a small script that will anonymize a DICOM directory with enumeration and create an identity file. What, exactly, is that, you may ask? Let me explain:
With enumeration enabled, the Anonymizer class will keep track of the values it encounters for all the data elements where this feature has been requested. In the first DICOM file, the value of the "Patient's Name" data element will be changed from, say, "John Doe" to "Name1". All subsequent DICOM files that contain this name will receive the same "Name1" value for that particular data element. When another name is encountered, the anonymized value will be bumped to "Name2". Likewise, if several sets of DICOM files contain the same "Referring Physician's Name", they will all be anonymized with the same value, such as "Physician1".
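The bookkeeping behind enumeration can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not the code ruby-dicom uses: the ValueEnumerator class and its method names are hypothetical.

```ruby
# Stand-alone sketch of the enumeration idea (hypothetical class,
# not part of ruby-dicom).
class ValueEnumerator
  def initialize(prefix)
    @prefix = prefix  # e.g. "Name" or "Physician"
    @mapping = {}     # original value => enumerated replacement
  end

  # Returns the replacement for a value. A value seen for the first
  # time gets the next free number; a known value keeps its replacement.
  def anonymize(original)
    @mapping[original] ||= "#{@prefix}#{@mapping.size + 1}"
  end

  # Replacement => original pairs, in order of first appearance.
  def identities
    @mapping.invert
  end
end

names = ValueEnumerator.new("Name")
replacements = ["John Doe", "John Doe", "Jane Roe"].map { |n| names.anonymize(n) }
puts replacements.join(", ")
```

The two files carrying "John Doe" both end up with "Name1", while "Jane Roe" becomes "Name2". The identities method returns the pairs you would need to store in order to reverse the process later.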
In addition to using enumeration and an identity file, we will tell our Anonymizer to delete all private tags from the anonymized DICOM files (you never know what might be hiding in those private tags, right?).
# Load the ruby-dicom library:
require "rubygems"
require "dicom"
# Load an anonymization instance:
a = DICOM::Anonymizer.new
# Add the folder to be anonymized:
a.add_folder("/home/dicom")
# Request private data element removal:
a.delete_private = true
# In addition to the default selection of tags to be anonymized, we would like to add "Operators' Name" as well:
a.set_tag("0008,1070", :value => "Operator", :enum => true)
# Let's leave out "Institution Name" from the anonymization process:
a.remove_tag("0008,0080")
# Select the enumeration feature:
a.enumeration = true
# Specify a file to keep track of the identities behind the enumerated, anonymized values:
a.identity_file = "/home/identity.txt"
# Avoid overwriting the original files by storing the anonymized files in a separate folder:
a.write_path = "/home/write"
# Print the list of selected tags just to verify that everything is correct:
a.print
# Run the actual anonymization:
a.execute
Let's name it anonymize.rb, execute it, and see how it goes. Your terminal output might end up looking something like this:
user@home:~/ruby/dicom$ ruby anonymize.rb
0008,0012  Instance Creation Date      DA  20000101   false
0008,0013  Instance Creation Time      TM  000000.00  false
0008,0020  Study Date                  DA  20000101   false
0008,0023  Content Date                DA  20000101   false
0008,0030  Study Time                  TM  000000.00  false
0008,0033  Content Time                TM  000000.00  false
0008,0090  Referring Physician's Name  PN  Physician  true
0008,1010  Station Name                SH  Station    true
0010,0010  Patient's Name              PN  Patient    true
0010,0020  Patient ID                  LO  ID         true
0010,0030  Patient's Birth Date        DA  20000101   false
0010,0040  Patient's Sex               CS  N          false
0020,4000  Image Comments              LT             false
0008,1070  Operators' Name             PN  Operator   true
*******************************************************
Initiating anonymization process.
Searching for files... Done.
1046 files have been identified in the specified folder(s).
Processing write paths... Done
Initiating read/update/write process (This may take some time)...
Anonymization process completed!
All files in specified folder(s) were SUCCESSFULLY read to DICOM objects.
All DICOM objects were SUCCESSFULLY written as DICOM files (1046 files).
Writing identity file. Done
Elapsed time: 55.0 seconds
*******************************************************
Now, part of what we did here was to create an identity file that enables us to re-identify the anonymized DICOM files at a later time, if we so wish. Let's take a look at the content of the identity file that was created for us:
0008,0090
Physician1;Dr Feelgood
Physician2;Sam Surgery
Physician3;Pat Placebo

0008,1010
Station1;CT1
Station2;CT2

0010,0010
Patient1;Peter Pelvis
Patient2;Jackie Chaan
Patient3;Indiana Jonas
Patient4;Sara Sirius
Patient5;Anna Sahara
Patient6;John Doo
Patient7;Richard Sick
Patient8;Tommy Tumor

0010,0020
ID1;123456
ID2;654321
ID3;666444
ID4;444666
ID5;010110
ID6;098765
ID7;567890
ID8;011001

0008,1070
Operator1;John Operator
Operator2;Jack Buttons
Operator3;Connie Computer
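If you ever need to re-identify a file, the identity file has to be read back into something you can look values up in. Here is a minimal sketch in plain Ruby. It assumes a layout that I have not verified against the code that writes the file: each tag on its own line, followed by one "anonymized;original" pair per line. The parse_identities method is a hypothetical helper, not part of ruby-dicom.

```ruby
# Sketch: read identity file text into a nested lookup table.
# Assumed layout (not verified against ruby-dicom's writer): a tag line
# such as "0008,0090", followed by one "anonymized;original" pair per line.
# parse_identities is a hypothetical helper, not part of ruby-dicom.
def parse_identities(text)
  identities = Hash.new { |hash, tag| hash[tag] = {} }
  current_tag = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line =~ /\A\h{4},\h{4}\z/
      # A tag line starts a new group of pairs.
      current_tag = line
    elsif current_tag && line.include?(";")
      # Split on the first semicolon only, in case the original contains one.
      anonymized, original = line.split(";", 2)
      identities[current_tag][anonymized] = original
    end
  end
  identities
end

sample = <<~TEXT
  0008,1010
  Station1;CT1
  Station2;CT2

  0010,0010
  Patient1;Peter Pelvis
  Patient2;Jackie Chaan
TEXT

lookup = parse_identities(sample)
puts lookup["0010,0010"]["Patient2"]
```

For a real file you would pass File.read with the path of your identity file instead of the sample text.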
Part 3: Anonymizing burned-in image data
For this final part, we will manipulate the pixel data itself. This functionality is not built into the Anonymizer class, so we will be working directly with the DObject class instead. Our task here will be to wipe out any text that has been printed directly in the pixel data, and just to show off, we'll annotate the DICOM image with a little signature of our own.
In this example I will use a DICOM file that I found somewhere on the web. As such, its burned-in content is not sensitive, but the principle remains: to show how to remove burned-in text from the image. The original image is displayed below. In case you are unfamiliar with RMagick, I'll show you how the DICOM image data was extracted and converted to a JPEG image:
require 'dicom'
require 'RMagick'
dcm = DICOM::DObject.read("myFile.dcm")
image = dcm.image
image.write("dicom.jpg")
What we are going to do now is make a small script that does the following:
- Reads the DICOM file and extracts the pixel data as an RMagick object.
- Defines a number of rectangles to cover the various areas of the image containing burned-in text.
- Iterates through these (black) rectangles and paints them onto our image object.
- Creates a bit of annotation and puts that in our image object.
- Puts the RMagick object back into the DICOM object and saves it to file.
require 'dicom'
require 'RMagick'
include Magick
# Load file and image:
dcm = DICOM::DObject.read("/home/dicom/burned.dcm")
# You might (or might not) want to normalize the gray scale:
image = dcm.image.normalize
# Create an array of 'black' rectangles, each given as [x1, y1, x2, y2]:
rectangles = Array.new
rectangles << [0,0,300,100]
rectangles << [411,0,511,100]
rectangles << [0,380,100,511]
rectangles << [411,390,511,511]
rectangles << [0,420,511,511]
# Paste black rectangles on top of image:
rectangles.each do |r|
  erase = Draw.new
  erase.fill "Black"
  erase.rectangle r[0], r[1], r[2], r[3]
  erase.draw(image)
end
# Insert annotations:
text = Draw.new
text.fill = 'White'
text.pointsize = 14
text.annotate(image, 0, 0, 10, 30, "Anonymized by:")
t = Time.now
text.annotate(image, 0, 0, 10, 45, t.strftime("Processed %m/%d/%Y"))
text.font_style = ItalicStyle
text.annotate(image, 0, 0, 130, 30, "The Cool Admin")
# Insert pixel data back into the DICOM object and write to file:
dcm.image = image
dcm.write("/home/dicom/anonymous.dcm")
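If you are curious what painting a black rectangle amounts to at the pixel level, here is a small plain-Ruby sketch that works on a two-dimensional array instead of an RMagick image. The blank_rectangles method is a hypothetical helper, and treating both corners as inclusive and clamping to the image bounds are assumptions made for this illustration, not a description of how RMagick draws.

```ruby
# Plain-Ruby sketch (hypothetical helper, no RMagick involved): set every
# pixel inside each [x1, y1, x2, y2] rectangle to 0 (black). Both corners
# are treated as inclusive and coordinates are clamped to the image bounds.
def blank_rectangles(pixels, rectangles)
  height = pixels.length
  width = pixels.first.length
  rectangles.each do |x1, y1, x2, y2|
    ([y1, 0].max..[y2, height - 1].min).each do |y|
      ([x1, 0].max..[x2, width - 1].min).each do |x|
        pixels[y][x] = 0
      end
    end
  end
  pixels
end

# A tiny "image" with 4 rows and 6 columns, filled with the value 9:
image = Array.new(4) { Array.new(6, 9) }
# Blank the top left corner, and a rectangle that runs past the bottom right:
blank_rectangles(image, [[0, 0, 2, 1], [4, 3, 9, 9]])
image.each { |row| puts row.join(" ") }
```

The second rectangle deliberately extends beyond the image, which shows why clamping is handy when you reuse one set of rectangles on images of different sizes.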
Now, let's have a look at the resulting DICOM image. Using the technique detailed earlier for extracting the pixel data and saving them as an image file, we get the following:
Hopefully, as of this moment, your anonymization skills should be sufficient to tackle whatever the local DICOM disciples at your office throw at you. Using the scripting power of the Ruby language, along with ruby-dicom and RMagick, you should be able to do pretty much anything you want with your DICOM files as far as anonymization is concerned.
I hope you have found this tutorial helpful, and as always: All feedback is appreciated!
Published: March 30th 2009
Last updated: May 6th 2012
chris.lervag @nospam.com @gmail.com