
Migrate your (WordPress) website into a Microservice

Greetings

We recently migrated our WordPress website into our microservices stack so that we can maintain everything in one place. It was a fun experience, so I'm sharing it here for you to learn from.
I will walk through the high-level thinking process and the reasoning behind our decisions.
Note: Most of this applies to non-WordPress websites as well.

Option 1 - Direct database access

WordPress uses MySQL as the database to store posts. If you have full access to the database, you can query it directly and fetch the data with your own queries. Luckily, this was not our case, haha. Otherwise we would not have had this learning experience.
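We did not go this route, but if you do have database access, a minimal sketch could look like the one below. It assumes the default "wp_" table prefix and the standard wp_posts columns. `runQuery` is a hypothetical helper (not part of any library) that takes an SQL string plus parameters and resolves to an array of rows; wire it to whichever MySQL client you use.

```javascript
// Sketch: read published posts straight from the WordPress database in batches.
// Assumes the default "wp_" table prefix; adjust if your install uses another one.
const PUBLISHED_POSTS_SQL = `
  SELECT ID, post_title, post_name, post_content, post_date_gmt, post_modified_gmt
  FROM wp_posts
  WHERE post_type = 'post' AND post_status = 'publish'
  ORDER BY ID
  LIMIT ? OFFSET ?`;

// `runQuery(sql, params)` is a hypothetical helper that resolves to an array of rows.
async function fetchPublishedPosts(runQuery, batchSize = 500) {
  const posts = [];
  for (let offset = 0; ; offset += batchSize) {
    const rows = await runQuery(PUBLISHED_POSTS_SQL, [batchSize, offset]);
    posts.push(...rows);
    if (rows.length < batchSize) break; // a short (or empty) batch means we are done
  }
  return posts;
}
```

Reading in batches keeps memory use predictable on large sites, and ordering by ID keeps the batches stable between queries.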

Option 2 - Selenium

We started with Selenium WebDriver driving the Chrome browser. Selenium scraped all the information we needed.
With Selenium, we can navigate through pages, click on links, and follow them to sub-pages. Once we reach the desired page, we can use the Selenium APIs to grab data. It provides multiple selector types to find elements and read their data.
You can use any similar tool, but this is the one we knew, which made it the fastest choice for us.
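To make the idea concrete, here is a rough sketch of scraping one post page with the selenium-webdriver package for Node.js. This is not our exact job: the selectors (`h1.entry-title`, `entry-content`) are hypothetical and depend on your theme. The driver and `By` are passed in as arguments, so the function itself has no hard dependency on a running browser.

```javascript
// Sketch of a Selenium scrape for a single post page.
// `driver` and `By` come from the selenium-webdriver package, e.g.:
//   const { Builder, By } = require('selenium-webdriver');
//   const driver = await new Builder().forBrowser('chrome').build();
//   const post = await scrapePostPage(driver, By, 'https://example.com/hello-world/');
//   await driver.quit();
// The selectors below are hypothetical; adjust them to your theme's markup.
async function scrapePostPage(driver, By, url) {
  await driver.get(url);

  // CSS selector: the post title.
  const title = await driver.findElement(By.css('h1.entry-title')).getText();

  // XPath selector: every link inside the post body.
  const anchors = await driver.findElements(
    By.xpath("//div[contains(@class,'entry-content')]//a[@href]")
  );
  const links = [];
  for (const anchor of anchors) {
    links.push(await anchor.getAttribute('href'));
  }

  return { url, title, links };
}
```

The collected links are what lets the job move on to sub-pages: feed them back into the same function, or into a queue, to walk the site.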

There is a lot you need to know, and a lot you can learn by doing. You will need, or will pick up, the technologies below.
  • Java or JavaScript
  • Selenium basics
  • Selenium element selectors
  • CSS selectors
  • XPath selectors
It was a nice throwback for me to use Selenium again after a very long time.

Because the Selenium option has to run through a browser, and we also needed to plan an ETL job, we tried the WordPress API next.

Option 3 - WordPress API

WordPress provides a comprehensive REST API that lets you build separate interfaces on top of WordPress-hosted data.
It exposes multiple endpoints and supports pagination. One of the most important details is that the record counts come back as response headers: X-WP-Total gives the total number of posts and X-WP-TotalPages gives the total number of pages. Also note that WordPress implements HAL only partially. Fetching everything is just a matter of writing a for loop based on those values.
Node.js is well suited to manipulating JSON data, so choosing it was a straightforward decision.
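The for loop mentioned above can be sketched roughly like this. It is a minimal example, not our production job. It assumes WordPress is served from the root of the domain and uses the global `fetch` that ships with Node.js 18+. The `fetchImpl` option is my own addition so you can swap in Axios, got, or a stub.

```javascript
// Sketch: page through /wp-json/wp/v2/posts using the X-WP-TotalPages header.
// Assumes WordPress lives at the root of `siteUrl`.
// `fetchImpl` defaults to the global fetch in Node.js 18+; pass any function
// with the same shape to stub it or to use a different HTTP client.
async function fetchAllPosts(siteUrl, { perPage = 100, fetchImpl = fetch } = {}) {
  const posts = [];
  let totalPages = 1;

  for (let page = 1; page <= totalPages; page++) {
    const url = new URL('/wp-json/wp/v2/posts', siteUrl);
    url.searchParams.set('per_page', String(perPage)); // WordPress caps this at 100
    url.searchParams.set('page', String(page));

    const response = await fetchImpl(url);
    if (!response.ok) {
      throw new Error(`Request for page ${page} failed with status ${response.status}`);
    }

    // Every response carries the page count; the first one sets the loop bound.
    totalPages = Number(response.headers.get('X-WP-TotalPages')) || 1;
    posts.push(...(await response.json()));
  }

  return posts;
}
```

Calling `fetchAllPosts('https://example.com')` would return one flat array with every published post, which is a convenient shape for the transform step of an ETL job.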

Here are the things you need to know, or will learn, for this option.
  • WordPress API
  • Node.js and JavaScript
  • Axios or got library
  • Cheerio or jsdom
  • DOM manipulation using CSS selectors and XPath
  • REST HAL manipulation
Cheerio was a nice addition for me, as it reminds me of our old friend jQuery.

Even though the WordPress API is the standard way to read the data, it strangely did not return a few of the properties we needed.

Option 4 - Combination

We could eliminate the disadvantages of the options above by combining them. Because the WordPress response includes links to the resources, running a Selenium job is not necessary. Instead, we can make plain HTTP calls and parse the returned DOM. So we used Axios for the HTTP calls and Cheerio to parse and manipulate the DOM.
The technical stack you need for this is the same as for Option 3.
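As a rough sketch of the combination: take a post from the REST response, follow its `link`, and read the missing fields from the rendered HTML. The field names `author` and `heroImage` and both selectors are hypothetical stand-ins, since the properties you are missing will differ. `httpGet` and `load` are passed in as arguments here so the function can be tried without the libraries installed.

```javascript
// Sketch: enrich an API post with fields that only appear in the rendered page.
// `httpGet` is expected to behave like axios.get (resolves to { data }),
// and `load` like cheerio.load. Typical wiring:
//   const axios = require('axios');
//   const cheerio = require('cheerio');
//   const deps = { httpGet: (url) => axios.get(url), load: cheerio.load };
//   const enriched = await enrichPost(post, deps);
// The selectors are hypothetical and depend on your theme.
async function enrichPost(post, { httpGet, load }) {
  const { data: html } = await httpGet(post.link);
  const $ = load(html);

  return {
    id: post.id,
    title: post.title.rendered,
    modified: post.modified_gmt,
    // Fields the REST response did not give us, read from the DOM instead.
    author: $('.author-name').first().text().trim(),
    heroImage: $('meta[property="og:image"]').attr('content') ?? null,
  };
}
```

Because this is plain HTTP plus HTML parsing, it runs without a browser, which makes it much easier to schedule as an ETL job than the Selenium version.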

Which option to use?

There is no single best solution; it depends on several factors. Is it a one-time migration, or do you need to sync data regularly so that both websites stay open while providing different user experiences? Other questions to ask:
  • What is the engineering cost?
  • What is the hosting cost?
  • How long will it take, both to develop and to get to market?
  • How competent is the team with the technical stack?
  • How will you maintain it in the future?
Experience matters when making such decisions, because we need to drive the business. Especially when time is limited, we need to take the quickest possible path while still leaving room for future enhancements.
Our decision was to run the job once as an MVP solution and let it evolve into an ETL job over time.

Data sync

If you want to keep both websites open until you retire the old one, you have to sync the latest WordPress updates into the microservice. How can we do that? This is where we can use the "modified" property in the WordPress response. Compare that date with the last job run time and sync the difference. If a post is new, its "date" (the creation date) and "modified" properties would typically be the same.
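The comparison can be sketched as a small pure function. It uses the `date_gmt` and `modified_gmt` fields from the posts response. The function name and the created/updated split are my own illustration: here a post counts as new when its creation date is later than the last run, which is one reasonable way to tell inserts from updates.

```javascript
// Sketch: pick the posts that changed since the last sync run.
// WordPress returns `date_gmt` and `modified_gmt` as ISO 8601 strings without
// a timezone suffix, so "Z" is appended to parse them as UTC.
const toUtcMillis = (wpGmtDate) => Date.parse(`${wpGmtDate}Z`);

// `posts` is the array from the REST API, `lastRunAt` is a Date.
function diffSinceLastRun(posts, lastRunAt) {
  const lastRunMillis = lastRunAt.getTime();
  const created = [];
  const updated = [];

  for (const post of posts) {
    if (toUtcMillis(post.modified_gmt) <= lastRunMillis) continue; // unchanged

    // Created after the last run: insert. Otherwise it already exists: update.
    if (toUtcMillis(post.date_gmt) > lastRunMillis) created.push(post);
    else updated.push(post);
  }

  return { created, updated };
}
```

Store the start time of each job run and pass it in as `lastRunAt` on the next one. Recent WordPress versions also document a `modified_after` query parameter on the posts endpoint, which would move this filter to the server; check the REST API reference for your version before relying on it.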

Bonus - Full stack developer project

This is a fun and beautiful project for any passionate developer to work on, and you can learn a lot from it. Here is a high-level architecture and the tech stack you can use. My expertise is in Java and JavaScript, so I'm suggesting a polyglot solution.
Use toscrape (or a similar website) as the source.


That's it, guys. I will provide a serverless solution for this in an upcoming article.

Resources

WordPress API - https://developer.wordpress.org/rest-api/
Selenium - https://www.selenium.dev/
Cheerio - https://www.npmjs.com/package/cheerio
jsdom - https://www.npmjs.com/package/jsdom
Axios - https://www.npmjs.com/package/axios
got - https://www.npmjs.com/package/got
Node.js
Spring Boot
JavaScript
toscrape - https://toscrape.com/  
