The nightmare that is Disqus export, and import into WordPress


Sorry for the relative quiet the past few days. I have taken it upon myself to help Jen McCreight port all her Disqus comments back into a format WordPress can work with, so that she can complete her migration and be one of our happy neighbors. The problem is fairly convoluted, though. First, the Disqus export file is not in a format that any existing WordPress plugin knows how to handle. Second, attempting to sync the Disqus database against a temporary WordPress install caused Disqus to break blaghag.com: it stopped showing comments on any existing thread.

The problem now involves mapping what Disqus thought each post name would be to the actual post name in the WordPress database. I’ve gotten most of the mismatches sorted out, with a few exceptions that simply don’t appear anywhere in the database in any form. There’s a distinct possibility that when I start importing these comments, some of them will end up on similarly named posts. I’m doing my best to avoid that, but I only have so much foresight. At least I’m down to about 300 unidentified comments, which isn’t bad at all.
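To make the slug-matching problem concrete, here is a minimal Python sketch of one way to pair Disqus thread names with WordPress post slugs. It is not the script used for this migration; the function name `match_slugs`, the 0.8 similarity cutoff, and every slug in the usage example are assumptions made up for illustration. Exact matches are accepted outright, a single close match is accepted, and anything with several close candidates is set aside for manual review, since those are the comments that risk landing on a similarly named post.

```python
from difflib import get_close_matches


def match_slugs(disqus_slugs, wp_slugs, cutoff=0.8):
    """Pair each Disqus thread slug with a WordPress post slug.

    Returns (matched, ambiguous, unmatched):
      matched   - dict of Disqus slug -> WordPress slug
      ambiguous - dict of Disqus slug -> list of close candidates
      unmatched - list of Disqus slugs with no close candidate
    """
    wp_set = set(wp_slugs)
    matched, ambiguous, unmatched = {}, {}, []
    for slug in disqus_slugs:
        if slug in wp_set:
            # Exact match: nothing to decide.
            matched[slug] = slug
            continue
        candidates = get_close_matches(slug, wp_slugs, n=3, cutoff=cutoff)
        if len(candidates) == 1:
            matched[slug] = candidates[0]
        elif candidates:
            # Several similar post names: leave this one for a human.
            ambiguous[slug] = candidates
        else:
            unmatched.append(slug)
    return matched, ambiguous, unmatched


# Hypothetical slugs, for illustration only.
matched, ambiguous, unmatched = match_slugs(
    ["hello-world", "why-i-blog_12", "my-post", "post-that-vanished"],
    ["hello-world", "why-i-blog", "why-i-blag", "my-post-2", "my-post-3", "about"],
)
```

In this example `my-post` is reported as ambiguous rather than guessed at, and `post-that-vanished` ends up in the unmatched pile.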

The next thing I have to do is figure out a way to map child comments to their parents and preserve threading. What… a… mess. Disqus has basically made sure that once you’re on their service, you’re not leaving. Fair warning.

So, my personal non-work time will be mostly spent ignoring the world and hacking away at this XML file until I can import it into WordPress and save the roughly 37,000 comments that have been saved only on Disqus and never to Jen’s Blogger database. And I’m mostly doing this because Disqus should have made this easy by using the generic WXR format, but they didn’t; they went with their own schema. I’m out to prove this kind of import can be done. It’s personal now.

Share your Disqus horror stories, or WordPress database-hacking stories here. Who knows? Maybe you’ll have a solution I haven’t thought of.

Comments

  1. says

    Well, honestly, I was the most likely suspect.

    Despite freaking hating Disqus and everything they stand for, I have successfully imported all 45142 comments (plus ~300 in the pending pile) into WordPress. Blag Hag’s coming home, intact. I even preserved threading.

  2. Daniel Schealler says

    As a developer myself, I salute you for dedicating so much of your free time to this endeavor…

    But on the off chance this condition turns out to be catching-

    *woosh*

    *poof of Daniel Schealler shaped smoke*

  3. says

    Hi Jason.
    Disqus definitely does not make it easy for you to leave. I’m trying to do this right now as well. We migrated our blog from Movable Type to WordPress. Comments were on Disqus. We are actually migrating 6 old blog domains that were on MT onto 1 new blog domain on WP. Trying to get the 6 separate Disqus profiles over to the new site has been less than pleasant. Originally I thought we’d stick with Disqus and migrate everything, but it’s been such a pain that I’d rather we just go with standard WP commenting at this point, so I’m trying to find a way to import them. Would love to connect with you about how you did it.

  4. says

    Adam: I hate to be a jerk about this, but given the amount of time I spent on it, I’d love to see some kind of return on investment. If you’re interested in donating a small amount (say $10), I could give you my PHP code, kludgey though it is. I can give you my general methodology for free though.

    First I built a temporary WordPress site that I could “throw away” later, on a home server. I had to edit the post slugs for each imported blog post to match the ones in the Disqus export file, sometimes editing both, to preserve which posts got which comments. Then, using PHP, I employed the SimpleXML tool to load the Disqus export file into an array. I then iterated through the import -> Disqus -> Posts array and directly inserted each comment into the WordPress database. To preserve parents, I mapped the comment ID to the Disqus comment IDs, so when it pointed to a parent it correctly nested the comments.

    Then I used the WordPress export tool, then another Python-based WXR Splitter tool to split up the exported file into bite-sized chunks that I could e-mail to Jen for her to import. When imported, all the parents would be preserved, though the comment IDs would change from the super-high-numbers Disqus uses to something more sane for the local blog without smashing the threads flat.

    For your other blogs, repeat the process: clear out the temporary WordPress database by truncating the wp_comments and wp_posts tables, import the posts from the next blog’s Movable Type database, run the comment import and export again, then move on to the blog after that. It’s a significant amount of heavy lifting but it’s well doable (as you can tell by the existence of every single bloody comment on Jen’s blog).

    Drop me an e-mail if you’re interested in throwing me some coin for the code. :)

  5. says

    I was under the impression that Disqus comments lived on Disqus servers and a copy is saved in my local WordPress database. So even if I decide to stop using the service I don’t lose any comment data.

    For example, if I post a new comment on a post, WordPress still has a copy of the comment even though Disqus keeps a copy with all the metadata etc…

    Can someone clarify this for me?

  6. ryan says

    Same problem. Will you do the import for me as a freelance job? (Very small file… 112kb) … just seems lame to toss the comments.

  7. Minty says

    I am having serious problems with this as well. Do you offer this as a service? I don’t have many comments to transfer... fewer than 100.

  8. Luke Hoyer Millar says

    Same here. Just changed from Blogger using Disqus to WordPress. Can’t get my old 150 comments into WordPress. Surely this can’t be right, Disqus.

  9. Alex says

    Hi!

    I have the same problem… Did you manage to create a script that can parse the Disqus XML file and place the comments into the WP database?

  10. says

    Did this person not come from WordPress? If you’re using Disqus on WordPress, all the comments are in your WordPress database. You just uninstall the plugin. Done.

    Why was this so complicated?

  11. says

    @Noah: No, they didn’t.

    Incidentally Jason, when I went looking at some old threads at Jen’s, not only had your script on occasion duplicated comments (so each individual comment in a thread appears twice, once with formatting preserved, once without) but in some instances entire threads were duplicated (thus quadrupling the comments!). If you want I can point you at a thread or two I posted in, pre-boobquake, where I noticed this.
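
For readers asking above about a script: the parent-mapping step outlined in comment 4 can be sketched roughly as follows. This is a Python illustration, not the PHP code mentioned in that comment, and it takes a slightly different route: comment 4 appears to reuse the Disqus IDs directly and let the later WordPress export/import renumber them, whereas this sketch assigns new local IDs up front and rewrites the parent pointers itself. The sample XML is a simplified, made-up stand-in; the element and attribute names (`export`, `post`, `id`, `thread`, `parent`) are assumptions and do not reproduce the real Disqus schema. The output keys `comment_ID`, `comment_parent`, `comment_author`, and `comment_content` follow the column names of WordPress's `wp_comments` table.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified export. The real Disqus schema differs.
SAMPLE = """
<export>
  <post id="900001" thread="hello-world"><author>Ann</author><message>First!</message></post>
  <post id="900002" thread="hello-world" parent="900001"><author>Bob</author><message>Reply to Ann</message></post>
  <post id="900003" thread="hello-world" parent="900002"><author>Cy</author><message>Reply to Bob</message></post>
</export>
"""


def build_comment_rows(xml_text, first_id=1):
    """Turn exported posts into rows shaped like wp_comments records."""
    root = ET.fromstring(xml_text.strip())
    posts = root.findall("post")
    # First pass: give every Disqus ID a new local ID, so a reply can be
    # resolved even if it appears before its parent in the file.
    id_map = {p.get("id"): first_id + i for i, p in enumerate(posts)}
    rows = []
    for p in posts:
        rows.append({
            "comment_ID": id_map[p.get("id")],
            # 0 marks a top-level comment; it is also the fallback when
            # the parent is missing from the export.
            "comment_parent": id_map.get(p.get("parent"), 0),
            # Thread slug; still has to be resolved to a post ID.
            "thread": p.get("thread"),
            "comment_author": p.findtext("author", default=""),
            "comment_content": p.findtext("message", default=""),
        })
    return rows


rows = build_comment_rows(SAMPLE)
```

Building the ID map in a separate first pass is the design choice that keeps threading intact regardless of the order of posts in the file. A real import would also need to resolve each thread to a WordPress post and guard against inserting the same comment twice.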
