How We Migrated Production Infrastructure to New Servers and Made the Project 5 Times Faster

What Happens After the Servers Are Up and Running

Moving a large project to new infrastructure rarely ends the moment the servers boot. Many people assume that once the website opens, the job is done. In practice, that is where the real work starts.

A real production environment requires careful checking after migration. You need to reconfigure internal connections, rewrite configs, and close access to services. Most importantly, you have to find out why the new environment may initially run even slower than the old one.

We moved a large e-commerce project to Hetzner servers using snapshots. We configured a secure private network between the machines, cleared out the traps left by old configs, and deployed the change that made the site five times faster: a local Redis instance right on the web server.

Starting Architecture: How Resources Were Distributed

At the very beginning, the project infrastructure lived on three separate machines. The first one was the Web server, where the application code ran directly. The second one was the Database server running MySQL/MariaDB. The third one was a separate server for infrastructure services, hosting RabbitMQ, Redis, and ElasticSearch.

For the move, we decided to use snapshots of existing machines. The method is convenient. It allows you to quickly get an exact copy of a familiar environment and deploy it on new Hetzner hardware without building everything from scratch.

But there is a catch. Along with the system and files, a snapshot copies the old configs. This means that all hidden dependencies, environment variables, and hardcoded localhost bindings move too. Once the servers get new IPs in the Private Network, these old connections simply stop working.
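
One way to surface such leftovers is to scan the copied configs for references to the local machine. The sketch below is a minimal illustration in Python, not the tooling used in this project. The variable names and values in the sample are hypothetical.

```python
import re

# Patterns that usually mean a service is expected on the same machine.
LOCAL_HOST_PATTERN = re.compile(r"\b(localhost|127\.0\.0\.1)\b")


def find_local_references(config_text: str) -> list:
    """Return (line_number, line) pairs that mention localhost or 127.0.0.1."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and commented-out settings cannot break a connection.
        if not stripped or stripped.startswith("#"):
            continue
        if LOCAL_HOST_PATTERN.search(stripped):
            hits.append((number, stripped))
    return hits


# Hypothetical .env contents, for illustration only.
sample_env = """\
APP_ENV=prod
DATABASE_URL=mysql://app:secret@10.0.0.2:3306/shop
QUEUE_DSN=amqp://app:secret@localhost:5672/%2f
# CACHE_URL=redis://localhost:6379
"""

for number, line in find_local_references(sample_env):
    print(f"line {number}: {line}")
```

The same function could be run over every config file on a fresh snapshot before the first start.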

What We Checked First

When the new machines started up, we did not immediately push live traffic there. We had to make sure the prod environment was truly ready.

We made a checklist for a quick audit. We went through the network settings between the servers and checked the availability of the database, queues, and cache. We also reviewed the internal DSN strings.
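
The availability part of such a checklist can be sketched as a TCP reachability probe run from the web server. This is an assumption-laden illustration: the ports are the defaults for each service, 10.0.0.2 is the database IP used later in this article, and 10.0.0.4 for the infrastructure server is invented for the example.

```python
import socket

# (name, host, port). Default ports; 10.0.0.4 is a hypothetical address.
SERVICES = [
    ("mysql", "10.0.0.2", 3306),
    ("rabbitmq", "10.0.0.4", 5672),
    ("redis", "10.0.0.4", 6379),
    ("elasticsearch", "10.0.0.4", 9200),
]


def check_port(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def audit(services, probe=check_port) -> dict:
    """Map each service name to True (reachable) or False (not reachable)."""
    return {name: probe(host, port) for name, host, port in services}
```

Calling audit(SERVICES) on the web server gives one True or False per service. An open port only proves the network path; credentials and DSNs still need their own check.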

Separately, we evaluated how the application behaves under test load and measured page response times. Only this approach lets you catch hidden issues before real buyers notice them.

Switchover Plan: 40 Minutes of a Strict "Red Zone"

The migration and preparation process itself lasted several hours. However, we compressed the most critical switchover phase into a tight 40-minute window. We worked according to a strict plan to minimize store downtime.

First, we pointed the domain from the old IP address to the new IP at Hetzner. Next, we completely stopped the web layer on the old server, shutting down PHP-FPM, nginx, and the cron jobs. Right after that, we stopped Redis.

After that came the database stage. We made a final MySQL backup on the old machine and shut the database down there completely. The next step was moving this fresh dump to the new server and starting the database on the new hardware. Only then did we start the web server.

Once the code and database were up, we started copying image files from the old server to the new one so that customers would not lose content.

Deployment Configuration: How We Updated the Release Cycle for the New Hardware

Once the basic services launched, a question arose regarding further development. The old deployment script was tied to the previous architecture. We had to quickly reconfigure the deployment cycle for the new Hetzner machines.

We rewrote the automation configs, updated SSH access keys, and verified how the code rolls out to the new servers. It was important to do this immediately. The development team should not wait. They need a stable and clear tool to deploy fixes to the new environment without manual tweaks through the console.

Post-Migration Traps and Solutions

The first serious issue came from RabbitMQ. One of the application's CLI commands started crashing. In the logs, we saw an AMQP authentication error: ACCESS_REFUSED.

It looked as if the login or password was wrong. But when we dug into the configs, it turned out that localhost had simply remained in the DSN string for the Messenger. The web server was looking for the message queue on itself, even though RabbitMQ already lived on another machine. We changed this value to the correct internal IP inside the Private Network, and the command started working.
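
A check like this can be automated by parsing the DSN instead of reading it by eye. The sketch below uses Python's standard urllib.parse. The DSN values and the address 10.0.0.4 are hypothetical, and the helper does not handle IPv6 literals.

```python
from urllib.parse import urlsplit, urlunsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def points_to_local_host(dsn: str) -> bool:
    """True if the DSN targets the machine the application itself runs on."""
    return urlsplit(dsn).hostname in LOCAL_HOSTS


def with_host(dsn: str, new_host: str) -> str:
    """Return the DSN with only the host replaced (credentials, port, path kept)."""
    parts = urlsplit(dsn)
    # netloc looks like [user[:password]@]host[:port]
    credentials, at, host_port = parts.netloc.rpartition("@")
    _, colon, port = host_port.partition(":")
    return urlunsplit(parts._replace(netloc=f"{credentials}{at}{new_host}{colon}{port}"))


# Hypothetical DSN left over from the old environment.
old_dsn = "amqp://app:secret@localhost:5672/%2f/messages"

if points_to_local_host(old_dsn):
    print(with_host(old_dsn, "10.0.0.4"))
```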

HTTP Basic Auth and Database Security

We handled security in several layers. For technical URLs and service panels, we quickly put up a simple access barrier by closing them with Basic Auth in Nginx. This is a good temporary solution for utility interfaces and staging, and it does not require changing the site's own code.
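
After closing a panel this way, it is worth confirming that the barrier actually holds. The helpers below are a hypothetical sketch in Python: one builds the Authorization header value defined by HTTP Basic Auth, the other encodes the expectation that anonymous requests get 401 and authenticated ones do not. The credentials are invented.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of the HTTP Authorization header for Basic Auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def barrier_works(status_without_credentials: int, status_with_credentials: int) -> bool:
    """Anonymous requests must be rejected with 401; valid credentials must pass."""
    return status_without_credentials == 401 and status_with_credentials != 401


# Hypothetical credentials, for illustration only.
print(basic_auth_header("admin", "secret"))
```

The two status codes would come from any HTTP client, requesting the protected URL once without the header and once with it.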

We dealt with the database more strictly. We had a clear scheme: the database server got the internal IP 10.0.0.2, and the web server got 10.0.0.3.

We completely blocked MySQL from accepting external connections. The Hetzner firewall rules allow database traffic only from the web server's IP. In addition, we restricted the DB user itself in the database settings so that it can connect exclusively from host 10.0.0.3. If one layer of defense fails, the other layers still protect the system.
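
The account-level restriction relies on MySQL's 'user'@'host' account naming. As a rough sketch, the Python helper below prints statements of the kind involved. The user name, database name, and privilege list are assumptions for illustration, and the password is left as a placeholder.

```python
def restricted_account_sql(user: str, host: str, database: str) -> list:
    """Build statements for an account that may connect only from one host.

    The arguments are trusted constants typed by an administrator;
    this helper is not meant for untrusted input.
    """
    account = f"'{user}'@'{host}'"
    return [
        f"CREATE USER {account} IDENTIFIED BY '<password>';",
        # Privilege list is an assumption; grant only what the application needs.
        f"GRANT SELECT, INSERT, UPDATE, DELETE ON `{database}`.* TO {account};",
    ]


# 10.0.0.3 is the web server's internal IP from the scheme above.
for statement in restricted_account_sql("shop_app", "10.0.0.3", "shop"):
    print(statement)
```

With such an account, a connection attempt from any other address is rejected by MySQL itself, even if a firewall rule is accidentally loosened.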

ElasticSearch Migration: Reindexing the Catalog Without Speed Loss

Launching ElasticSearch on the new infrastructure server was a separate task. Since we were shutting down the old machine completely, the old search indexes were no longer usable. Without fresh indexes, the application on the new hardware could not search for products properly.

We launched a full reindexing of the catalog. For 160,000 products, this is a heavy process that strains the system. We tuned the memory allocation (Java heap size) to the new Hetzner server specs so that the reindexing would finish quickly without choking the disk subsystem. After that, search on the site was fast again.
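
The exact heap value depends on the server, and this article does not give the RAM size. As a starting point, common Elasticsearch guidance is to give the heap no more than about half of RAM and to stay below the JVM's compressed-pointers threshold. The Python sketch below encodes that rule of thumb; the 26 GB cap is a conservative assumption, since the real threshold varies by JVM.

```python
# Conservative assumption for the compressed object pointers threshold.
COMPRESSED_OOPS_SAFE_GB = 26


def suggested_heap_gb(total_ram_gb: int) -> int:
    """Half of RAM, capped below the compressed-pointers threshold, at least 1 GB."""
    half_of_ram = total_ram_gb // 2
    return max(1, min(half_of_ram, COMPRESSED_OOPS_SAFE_GB))


def jvm_heap_options(heap_gb: int) -> list:
    """Minimum and maximum heap are set to the same value to avoid resizing."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


# Hypothetical server sizes, for illustration only.
for ram in (16, 64):
    print(ram, "GB RAM ->", " ".join(jvm_heap_options(suggested_heap_gb(ram))))
```

The remaining memory is not wasted: the operating system uses it for the filesystem cache, which the search engine depends on.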

Why Performance Dropped After the Move

When the site went live, we started measuring speed. The results were disappointing. On the old test server, the database responded in around 800 ms, but here we saw as much as 1500 ms.

This is a classic example of why launching the servers is not yet a victory. We started looking for the cause and found network latency between the web server and the infrastructure server where Redis initially lived. For a catalog of 160,000 items, this network hop became critical because the application made too many small requests to the cache while generating a page.
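
The effect is simple arithmetic: when cache requests are made one after another, every request pays the full round trip. The numbers in the Python sketch below are illustrative assumptions, not measurements from this project.

```python
def cache_wait_ms(requests_per_page: int, round_trip_ms: float) -> float:
    """Total time one page spends waiting on sequential cache round trips."""
    return requests_per_page * round_trip_ms


# Assumed values: 1000 small cache requests per page,
# 0.5 ms per round trip over the network, 0.05 ms on the same machine.
over_network = cache_wait_ms(1000, 0.5)
on_same_host = cache_wait_ms(1000, 0.05)

print(f"over the network: {over_network:.0f} ms per page")
print(f"on the same host: {on_same_host:.0f} ms per page")
```

Even a sub-millisecond hop adds up to a noticeable delay once it is multiplied by hundreds or thousands of requests per page.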

Why We Moved Redis to the Same Server Where the Code Runs

It became clear that the network delay had to be removed physically. We ran an experiment and started Redis as a local cache instance directly on the web server, where the project code itself runs.

The change hit the mark. Site responses became 5 times faster than in the old configuration. The application began fetching hot data without a round trip over the private network, and heavy queries simply stopped reaching MySQL.

At the same time, we tuned the Redis configuration strictly. The instance listens only on localhost, runs with protected-mode enabled, and requires a password. We turned off persistence (writing to disk) completely for maximum speed, which is acceptable for a cache that can be rebuilt from the database. We also set a hard memory limit and enabled an LFU key eviction policy.
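
To make these settings concrete, the Python sketch below renders a redis.conf fragment with the corresponding directives. The 2gb limit and the choice of allkeys-lfu (rather than volatile-lfu) are assumptions, since the article does not state them, and the password is a placeholder.

```python
def render_redis_conf(password: str, max_memory: str = "2gb") -> str:
    """Render redis.conf directives for a local, non-persistent LFU cache."""
    directives = [
        ("bind", "127.0.0.1"),                 # listen on localhost only
        ("protected-mode", "yes"),
        ("requirepass", password),
        ("save", '""'),                        # disable RDB snapshots
        ("appendonly", "no"),                  # disable the append-only file
        ("maxmemory", max_memory),             # assumed limit
        ("maxmemory-policy", "allkeys-lfu"),   # assumed LFU variant
    ]
    return "\n".join(f"{name} {value}" for name, value in directives) + "\n"


print(render_redis_conf("<password>"))
```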

The application does not try to push all 160,000 products into the cache. Only "hot" data stays in memory: popular items and the results of heavy queries. Redis evicts everything unpopular on its own.
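
The idea behind LFU eviction can be shown with a toy cache. This Python sketch implements exact least-frequently-used eviction with a fixed number of slots. Redis itself uses an approximate LFU with decaying counters and a memory limit rather than a slot count, so this illustrates the principle, not Redis internals. The key names are hypothetical.

```python
from collections import Counter
from typing import Optional


class SmallLFUCache:
    """Toy cache that evicts the least frequently read key when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.values = {}
        self.reads = Counter()

    def get(self, key: str) -> Optional[str]:
        if key not in self.values:
            return None
        self.reads[key] += 1
        return self.values[key]

    def set(self, key: str, value: str) -> None:
        if key not in self.values and len(self.values) >= self.capacity:
            # Evict the key with the fewest reads so far.
            coldest = min(self.values, key=lambda k: self.reads[k])
            del self.values[coldest]
            self.reads.pop(coldest, None)
        self.values[key] = value


cache = SmallLFUCache(capacity=2)
cache.set("product:popular", "cached page fragment")
cache.set("product:rare", "cached page fragment")
for _ in range(3):
    cache.get("product:popular")
cache.get("product:rare")
cache.set("product:new", "cached page fragment")  # cache is full, "product:rare" is evicted

print(sorted(cache.values))
```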

Incident Monitoring: How We Connected BetterStack to Slack

To keep the new infrastructure under control in real time, we deployed a monitoring system and linked it to our Slack workspace. We set up BetterStack, which checks service availability, server response time, and the state of the private networks around the clock.

BetterStack sends all important alerts and infrastructure health reports to a dedicated technical Slack channel. We see anomalies and load spikes as soon as they are detected. This lets us react to potential issues before the website starts slowing down for buyers.

What to Check Next After Such a Migration

Even with a great result, the work does not stop. Right now, we continue to monitor the system at several technical points.

We track the hit rate in Redis, check the amount of free memory, and look at how many keys are evicted by the LFU policy. We also keep an eye on the MySQL load and the RabbitMQ queues during daytime traffic peaks. Proper optimization is always a process: you stabilize first, then deal with the cache, and then fine-tune specific bottlenecks.
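
The hit rate is not reported directly by Redis. It is derived from the keyspace_hits and keyspace_misses counters in the output of the INFO command, where evicted_keys is also found. The Python sketch below parses such output; the sample numbers are invented for illustration.

```python
def parse_info(raw: str) -> dict:
    """Parse 'key:value' lines from Redis INFO output, skipping section headers."""
    stats = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        stats[key] = value
    return stats


def hit_rate(stats: dict) -> float:
    """Share of cache lookups served from memory, between 0.0 and 1.0."""
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


# Invented sample in the format of the INFO "Stats" section.
sample_info = """\
# Stats
keyspace_hits:950
keyspace_misses:50
evicted_keys:12
"""

stats = parse_info(sample_info)
print(f"hit rate: {hit_rate(stats):.1%}, evicted keys: {stats['evicted_keys']}")
```

A falling hit rate together with a growing evicted_keys counter usually suggests that the memory limit is too tight for the hot data set.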

Conclusion

This case study clearly shows that infrastructure migration is far more than just copying files and databases. The real work begins after pressing the "Start" button.

Only a full audit of internal DSNs, eliminating leftover localhost references, closing security holes, and finding the right cache strategy gives a real result. In our case, local Redis on the web server and layered defenses made the project 5 times faster while keeping it stable and predictable for the business.
