Tech-Improvement Work Keeps Getting Killed. Here's What Survives.
Background
With Q2 about to end, I went through the requirements I worked on this quarter and awkwardly discovered that I had nothing to put in the tech-improvement column. Thinking back, I really did no tech-improvement work this quarter. The last time the topic came up was two and a half months ago, when I proposed a quick-login feature to the product team so that users would not have to re-enter their credentials after switching accounts. The product team asked a series of "why" questions, steered the conversation elsewhere, and the matter was gradually forgotten. Here, only the product team can create requirements, so anything you want to do has to go through them first. But tech-improvement requirements are often labeled "useless" by the product team, which makes them very hard to get scheduled.
This is not an isolated case. Over the past two years I've run into similar situations at other companies: tech-improvement requirements are postponed indefinitely, or simply disappear. Senior management tacitly follows one principle: surviving matters more than living comfortably. You often hear this debate. One side, product and management, feels that:
Tech-improvement is just programmer self-gratification. So-called refactoring is merely rewriting already-working code, essentially a waste of company resources.
The other side is us programmer worker bees:
If you don't pay down technical debt, the code eventually becomes unmaintainable, and then no one can deliver anything.
Both sides have a point, but both are also a bit too absolute. So the question this article discusses is: should tech-improvement requirements still be done at all?
What Exactly Are Tech-Improvement Requirements?
Before discussing that, let's pin down what counts as a tech-improvement requirement. In my personal opinion, they fall into four categories, each with a different input-output ratio.
Architecture Upgrades
Typical examples include migrating from MVP to MVVM, migrating from traditional Views to Compose, modularization, and componentization. Such requirements take a long time, roughly six months to a year. At a fintech company I used to work for, five people spent a full year on componentization. Fortunately everyone stayed committed throughout; otherwise it could easily have been abandoned halfway. That is the biggest risk of this category: the work is prone to dying midway.
Infrastructure Optimization
The most common examples are CI/CD speed-ups, unit testing, and setting up monitoring and alerting. Compared to architecture upgrades, these take less time and their effects are visible immediately, so the risk is relatively small.
Code Nitpicking
Every team has its own coding standards and periodically asks members to check whether variable names follow the standard, whether camelCase is used correctly, or whether ternary expressions should be changed to if-else. This kind of requirement yields almost no benefit. It carries little risk, but there is also little need to do it.
Performance/Stability
This is probably the most frequently done type: ANR governance, memory optimization, startup speed. These requirements are quantifiable and visible. There is some risk (a fix for an occasional ANR can turn it into a guaranteed one), but they are genuinely worth the manpower invested.
In summary, I think the first two need to be done. The third, code nitpicking, can be done incidentally but shouldn't get a dedicated requirement. Many teams now use a hook to check code standards at commit time, which is a better approach. The last one can be folded into business development.
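A minimal sketch of the commit-time hook idea, not any particular team's setup. It assumes the standard is lowerCamelCase for Kotlin `val`/`var` names; the names `find_naming_violations`, `staged_kotlin_files`, and `main` are made up for illustration, and a real project would more likely rely on an existing linter.

```python
import re
import subprocess

# Assumption: the team standard is lowerCamelCase for Kotlin `val`/`var` names.
DECLARATION = re.compile(r"\b(?:val|var)\s+([A-Za-z_][A-Za-z0-9_]*)")
LOWER_CAMEL = re.compile(r"^[a-z][A-Za-z0-9]*$")


def find_naming_violations(source: str) -> list[tuple[int, str]]:
    """Return (line_number, name) for each declaration that is not lowerCamelCase."""
    violations = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        # Compile-time constants conventionally use UPPER_SNAKE_CASE, so skip them.
        if "const val" in line:
            continue
        for name in DECLARATION.findall(line):
            if not LOWER_CAMEL.match(name):
                violations.append((line_number, name))
    return violations


def staged_kotlin_files() -> list[str]:
    """List added, copied, or modified .kt files in the git index."""
    output = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [path for path in output.splitlines() if path.endswith(".kt")]


def main() -> int:
    """Return 1 (which blocks the commit) if any staged file has a violation."""
    failed = False
    for path in staged_kotlin_files():
        # Simplification: this reads the working-tree copy, not the staged blob.
        with open(path, encoding="utf-8") as handle:
            for line_number, name in find_naming_violations(handle.read()):
                print(f"{path}:{line_number}: '{name}' is not lowerCamelCase")
                failed = True
    return 1 if failed else 0
```

To use it as a hook, a `.git/hooks/pre-commit` script would call `sys.exit(main())`. A non-zero exit status makes git abort the commit, so the check costs nobody a scheduled requirement.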
Why Has Cutting Tech-Improvement Become the Norm?
Above, I categorized common tech-improvement requirements. You can see that not all tech-improvement requirements are meaningless; some are worth doing. But in the current market climate, why do we see so many examples of tech-improvement requirements being cut?
The Macro Environment Has Changed
Think back to before 2020, when the mobile internet was still expanding. The practice at big companies was: ship the business first, taking on debt is fine, pay it back slowly later out of the traffic dividend. Wasn't that the case? It was a virtuous cycle, in which the revenue from business growth covered the cost of tech-improvement.
But later, around 2022, the market started to worsen. Traffic peaked, and revenue growth slowed or even turned negative. The old tactic of covering costs through business growth stopped working. So when the boss sees a tech-improvement requirement that takes 5 people 3 months and delivers a 30% improvement in code maintainability rather than the 30% revenue increase the boss wants, any normal boss will want to cut it.
Many Tech-Improvements Truly Produce No Output
Honestly, if you recall the tech-improvement requirements you had cut in the past, ask yourself with a hand on your heart: should they have been cut?
- Claiming to refactor a page, but really just changing an Activity into a Fragment: swapping the shell and copying the internal logic over unchanged. The requirement got done, but it was practically useless and might even have caused a production crash.
- Just wanting to practice something new, like Hilt or coroutines, by rewriting stable code on top of a new framework, which genuinely raised the learning cost for other developers and the testing and regression cost.
- Unlimited scope expansion: starting out wanting only to optimize the login logic, and ending up changing the underlying storage structure. The more you change, the more holes you open up, and after release the boss chews you out.
Managers Are Starting to Understand Technical Measurement
In the past, you could just say words like code quality, sound architecture, and maintainability, and the boss would basically nod along. Now, more and more managers have learned to look at data:
- You say maintenance costs will decrease after refactoring; please provide specific data on how much the cost has decreased.
- Code quality vs. user conversion rate: which is more important?
- How many production ANRs will this refactoring eliminate, and in which scenarios?
When the measurement standard for many tech-improvement requirements shifts from qualitative to quantitative, many tech-improvements probably become too embarrassing to even propose. This is a good thing; it pushes us to do some truly meaningful tech-improvement.
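One way to put a number on "how much will maintenance cost decrease" is a payback period. The sketch below is only an illustration: `payback_months` is a hypothetical helper, and the 22 working days per month and the 30 person-days of monthly bug-fixing are assumed figures, not data from any real project.

```python
def payback_months(cost_person_days: float, saved_person_days_per_month: float) -> float:
    """Months until the cumulative savings equal the up-front cost."""
    if saved_person_days_per_month <= 0:
        return float("inf")  # no measurable saving means it never pays back
    return cost_person_days / saved_person_days_per_month


WORKING_DAYS_PER_MONTH = 22  # assumption; adjust for your own calendar

# The "5 people for 3 months" refactoring mentioned earlier, in person-days.
refactor_cost = 5 * 3 * WORKING_DAYS_PER_MONTH  # 330

# Hypothetical measurement: the team spends 30 person-days a month on bugs
# in this module and expects the refactoring to halve that.
monthly_saving = 30 * 0.5  # 15.0

print(payback_months(refactor_cost, monthly_saving))  # 22.0
```

Under these assumed numbers the refactoring takes 22 months to pay for itself, which is exactly the kind of figure that helps decide whether a proposal is worth making at all.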
How Should Tech-Improvement Requirements Be Done Going Forward?
Tech-Improvement Is the Tech Team's Own Business; Don't Compete with Business for Time
Take my quick-login example from the beginning. How do many teams do tech-improvement? They first propose a requirement to the product team, then compete with business requirements for scheduling. If you were the product manager, how would you see it? The business side is pushing you every day and there already isn't enough time on the schedule. Why would you care about so-called maintainability and extensibility? No need to measure anything; tech-improvement loses by default. From the company's perspective, business requirements have clear KPIs, while tech-improvement KPIs are vague. You can't win that fight.
I believe the correct approach is this: tech-improvement is an internal matter for the tech team, like cleaning the house regularly or scooping the cat litter at a fixed time every day. There's no need to ask outsiders for permission.
Tech-Improvement Should Be Done in Phases, Not All at Once
Take my earlier requirement to adapt to 16 KB page sizes, estimated at 26 person-days in total. I didn't work on it for 26 days straight. I broke it into several small requirements: do a bit, test a bit, release a bit. This phased approach is slower, but it avoids being stopped halfway through or discovering a pile of problems all at once at the end.
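The breakdown can be sketched as a greedy grouping of tasks into releases, each capped at a maximum size. Only the 26 person-day total comes from the text; the task names, the individual estimates, and the 10-day cap are hypothetical, as is the helper `split_into_phases`.

```python
def split_into_phases(tasks, max_days_per_phase):
    """Group ordered (name, days) tasks into phases no larger than the cap.

    A single task larger than the cap still gets a phase of its own.
    """
    phases = []
    current = []
    current_days = 0
    for name, days in tasks:
        # Close the current phase when adding this task would exceed the cap.
        if current and current_days + days > max_days_per_phase:
            phases.append(current)
            current, current_days = [], 0
        current.append((name, days))
        current_days += days
    if current:
        phases.append(current)
    return phases


# Hypothetical breakdown of the 26 person-day estimate.
tasks = [
    ("audit native libraries", 4),
    ("rebuild in-house native libraries", 6),
    ("upgrade third-party SDKs", 8),
    ("fix hard-coded page-size assumptions", 5),
    ("regression test on a 16 KB environment", 3),
]

phases = split_into_phases(tasks, max_days_per_phase=10)
for number, phase in enumerate(phases, start=1):
    total = sum(days for _, days in phase)
    print(f"phase {number}: {total} days, {[name for name, _ in phase]}")
```

With these assumed estimates the work falls into three phases of 10, 8, and 8 person-days, and each one can be tested and released before the next begins.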
The Arrival of the AI Era Requires Redefining Tech-Improvement Requirements
We now use AI for development every day. What does this mean for tech-improvement?
The Good Side
- Using AI to write code, unit tests, and documentation can improve tech-improvement efficiency.
- Using AI for refactoring (splitting modules, migrating APIs, renaming), tasks that previously took a person at least several days can be done in half a day.
- For tasks that used to consume massive manpower, AI cuts the cost enough that the boss might turn a blind eye.
Aspects to Watch
- The quality of AI-generated code is uneven, increasing the cost of code review and testing, and potentially introducing new technical debt.
- Don't go on a massive tech-improvement spree just because AI is convenient, or you'll get a "surprise" after going live~
Overall, AI gives infrastructure and performance-related tech-improvement requirements room to survive.
A Simple Summary
As working professionals earning those few taels of silver each month, why take on so many thankless tasks? Working smart, not just hard, is the true way. The above are all personal opinions, for reference only. No offense if you disagree.