Rambling thoughts and such like

Hello and welcome to the CogBlog. Here you'll find the (mostly) informative ramblings of the CogWorks staff on life, the Universe and Everything.

Migrating to Umbraco via CMS Import

Those of you who follow my Twitter feed will have seen some tweets over the last few weeks about migrating from Immediacy CMS to Umbraco.

We had an old client who was running Immediacy 5.1. This CMS is no longer supported, and the client wanted to move to Umbraco while migrating the existing content (HTML, media and members) across.

Enter the fantastic CMSImport 1.1, created by Richard Soeteman. This excellent package made the migration process a breeze.

Out of the box it can migrate content and members from numerous sources. In this instance the source was a SQL Server database. The query below was run once for each section of the old site to retrieve the content to import, with p_parent_id updated for each section, because CMSImport cannot currently do a hierarchical import.

query.jpg

The Pro version can also import any media referenced by the imported content.

mapping-step1.jpg

For this to work, the referenced images and files must be copied over to the root of your Umbraco installation. In my case I had docs, pdf, images and system_images folders in Immediacy.

I copied these over so that CMSImport could pick up those files and images during the import.

However, one issue that cannot be handled out of the box, and I guess this is an issue with any migration tool, is the recreation of links between content pages.

Just as Umbraco maintains content links in the backend using its own syntax, namely {localLink:1234}, Immediacy does something similar, i.e. ILINK|1234. All the content can be migrated and the existing links would still work, because the links are built from page names, but in the Umbraco backend those links would be stored as plain URLs such as /about-us/somepage.aspx. Initially the links would work. However, if the target page was ever renamed or moved and then republished, its URL would change and the links would break. So how do we migrate all the internal content links?

Thankfully CMSImport has a really nice feature called FieldAdaptors. It's an interface you implement so that, for a given datatype (in this case the rich text field), you can perform some processing just before the newly migrated content is saved.

So I created my own field adaptor. On the document type I was migrating to, I created an extra temporary field called Immediacy Original Id. During the migration setup I mapped the Immediacy page id to this field.

mapping-step2.jpg

In my field adaptor I loaded the HTML content into an Html Agility Pack document, then used XPath to retrieve all anchor tags with a file link (.doc, .pdf) in the href attribute. For each of those links I called the method MediaImportHelper.ImportMedia, which is part of the CMSImport Pro API. This imported the media and fixed the link so that it pointed to a valid Umbraco media item.
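To make the link-finding step concrete, here is a minimal sketch in Python rather than the C# of the real field adaptor. It uses the standard library HTML parser in place of Html Agility Pack and XPath, and the names FileLinkCollector and find_file_links are made up for illustration. It only finds the file links; the actual media import is left to MediaImportHelper.ImportMedia.

```python
from html.parser import HTMLParser

# File extensions mentioned in the post; extend as needed.
FILE_EXTENSIONS = (".doc", ".pdf")


class FileLinkCollector(HTMLParser):
    """Collects href values of <a> tags that point at document files."""

    def __init__(self):
        super().__init__()
        self.file_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Ignore any querystring or fragment when checking the extension.
        path = href.split("?", 1)[0].split("#", 1)[0]
        if path.lower().endswith(FILE_EXTENSIONS):
            self.file_links.append(href)


def find_file_links(html):
    """Return the hrefs of all anchors in html that link to a file."""
    collector = FileLinkCollector()
    collector.feed(html)
    return collector.file_links
```

Each href returned here is one that the real adaptor would hand over to the media import.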

The field adaptor then processed all the Immediacy links (basically every link with ILINK| in the href attribute). Using an Immediacy database stored procedure (immediacy_GetPagePath, Immediacy's equivalent of NiceUrl) I converted each ILINK to a full link, e.g. /about-us/somepage.aspx, and appended the querystring ?immPageId=1234, where 1234 was the original Immediacy target page id.
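The ILINK rewrite can be sketched roughly as follows, again in Python for illustration only. It assumes the href holds exactly ILINK|<id>, and it uses a plain dictionary (PAGE_PATHS, a made-up name) as a stand-in for the immediacy_GetPagePath stored procedure, so treat it as an outline of the idea and not the real adaptor code.

```python
import re

# Assumption: an Immediacy internal link is stored as href="ILINK|<page id>".
ILINK_HREF = re.compile(r'href="ILINK\|(\d+)"')


def expand_ilinks(html, get_page_path):
    """Replace ILINK hrefs with full paths tagged with the original page id.

    get_page_path is any callable that maps an Immediacy page id to its
    path, or returns None when the id is unknown.
    """

    def replace(match):
        page_id = int(match.group(1))
        path = get_page_path(page_id)
        if path is None:
            return match.group(0)  # leave unknown links untouched
        return 'href="{}?immPageId={}"'.format(path, page_id)

    return ILINK_HREF.sub(replace, html)


# Hypothetical lookup table standing in for the stored procedure.
PAGE_PATHS = {1234: "/about-us/somepage.aspx"}

print(expand_ilinks('<a href="ILINK|1234">About us</a>', PAGE_PATHS.get))
```

Keeping the original page id in the querystring is what lets a later pass find the matching Umbraco document.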

The first migration pass gave me all the content and media imported. The content URLs worked because the page names were the same as the Immediacy URLs, but they were not proper Umbraco internal links.

I then hacked the excellent content maintenance dashboard control, written once again by Richard Soeteman (no stopping this lad!), and added an extra button to it for fixing the Immediacy links.

content-maintenance.jpg

When clicked, this gets the bodyText field content of each selected document and extracts all links that contain ?immPageId=1234. For each Immediacy page id it makes an UmbracoExamine call into the internal index and gets the document with that original Immediacy value (after the initial migration pass, make sure the internal index is up to date; I did a manual rebuild). From that it gets the Umbraco page id and updates the link from

/about-us/somepage.aspx

to

{localLink:1234}

thus all internal links are converted to Umbraco links. You could do the Umbraco document lookup using Linq2Umbraco or XML XPath instead, but you would need to publish all the content first, and it would not be as fast as Lucene.
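As a rough illustration of this second pass, the sketch below (Python, not the actual dashboard code) swaps every href tagged with immPageId for a {localLink:...} reference. The dictionary of ids is a hypothetical stand-in for the Examine index lookup, the node id 1078 is invented for the example, and the exact href form Umbraco expects should be checked against your version.

```python
import re

# Matches any href that carries the immPageId marker added in the first pass.
TAGGED_HREF = re.compile(r'href="[^"]*\?immPageId=(\d+)"')


def to_local_links(html, umbraco_id_for):
    """Swap tagged hrefs for Umbraco localLink references.

    umbraco_id_for maps an original Immediacy page id to the new Umbraco
    node id, or returns None when no migrated document is found.
    """

    def replace(match):
        node_id = umbraco_id_for(int(match.group(1)))
        if node_id is None:
            return match.group(0)  # keep the working URL if lookup fails
        return 'href="{{localLink:{}}}"'.format(node_id)

    return TAGGED_HREF.sub(replace, html)


# Hypothetical mapping: Immediacy page id -> Umbraco node id.
ID_MAP = {1234: 1078}

html = '<a href="/about-us/somepage.aspx?immPageId=1234">About us</a>'
print(to_local_links(html, ID_MAP.get))
```

Links whose target cannot be found are left as plain URLs, so they keep working until they can be fixed by hand.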

One side effect of this is that after the links are updated and the documents are saved and published, the update date of each document is set to the date of migration, and the original update date from Immediacy is lost (this affects the ordering of news lists that depend on update date for sorting). However, a second migration pass with just the update date field mapped will fix this problem.

After performing the migration I copied over the template HTML, removed the Immediacy plugins and replaced them with Umbraco macros, e.g. left navigation, sitemap, news list: the usual suspects that no doubt all of you have created during your Umbraco careers. The site had a couple of forms and I recreated those using Umbraco Contour. One of the forms was a registration form, so I created a custom workflow to create the new member upon form submission. The old site had extranet facilities, and using CMSImport I migrated all 300 members over and set up the protected area via Umbraco public access.

I also set up UmbracoExamine with PDF indexing, so instead of Immediacy's default Index Server we now have a proper search using Lucene.net with search word highlighting. There are still a few bits to do on the site, but I will post a link to the live site when it's up.

And that is how you migrate from Immediacy to Umbraco!!

I would like to thank Richard Soeteman for all his help on this project. Without CMSImport this would have been a big job.

This method could in theory be used for other CMS migrations. If you can get to the CMS content (it's in a database, XML files or CSV) and you have a way of accessing the link generation method used by the CMS, then this would work.

Please note that updating content and migrating files/images can only be done with CMSImport 1.1 Pro.

3 comments for “Migrating to Umbraco via CMS Import”

  1. Posted 05 November 2010 at 14:23:09

    High 5 you rock!

  2. Posted 05 November 2010 at 16:54:27

    Awesome post Ismail,

    Excellent example how CMSImport can be used to migrate from other cms's to Umbraco. Very well written, love the detail!

    High 5 you rock my friend!

  3. Posted 11 November 2011 at 10:57:46

    Great. I used to work with Immediacy and couldn't wait to get away from it too when I discovered Umbraco.
