In this year’s Stack Overflow Developer’s Survey 64,000 developers laid bare their innermost developer secrets. How they learn, the tools they use, what they want - it’s all there. The whole survey is a fascinating foray into the brains of developers and the global software industry.
Some of the most common programming languages, Stack Overflow 2017
And yes, it’s true C and C++ are both nearly as old or older than SQL, but even combined they still aren’t as prevalent as SQL is today. This is likely explained by the database technology developers use:
The database technology developers use, Stack Overflow 2017
As you can see SQL databases dominate the top spots when it comes to database technology. Although they are far from being the most loved, with the exception of PostgreSQL:
The most loved database technologies, Stack Overflow 2017
And the SQL language itself doesn’t get much love, despite being one of the most common languages:
The most loved programming languages, Stack Overflow 2017
But there’s an upside: it’s one of the least dreaded languages…
The most dreaded programming languages, Stack Overflow 2017
OK, so survey data tells us SQL is in widespread use but a bit ‘meh’…
So why do we still use SQL?
The simple fact that both arrived early in the life of computing, and that for 90% of the time they just work, means databases have become a ‘solved problem’ you no longer need to think about.
It’s like how MailChimp has become synonymous with sending email newsletters. If you want to work with data you use RDBMS and SQL. In fact, there usually needs to be a good reason not to use them. Just like there needs to be a good reason not to use MailChimp for sending emails, or Stripe for taking card payments.
But people do use other other email automation software and payment solutions, just like people use NoSQL databases. Yet even with other database technology available, albeit less mature technology, SQL still reigns and reigns well.
So, finally, here are 8 reasons we still use SQL 43 years after it was first cooked up.
1. Simple mathematics
SQL was designed specifically for data so–surprise, surprise–it excels at accessing and organizing data.
Reason one: SQL is damn good at what it does.
RDBMS have been around for a while so they’ve been used in many, many different scenarios. From pre-web offline databases to heavily-modified SQL databases playing a central role in global apps like Facebook - RDBMS and SQL are battle-tested and have proven to be reliable after countless millions of hours running in production.
There’s a lot to be said for software that just works, especially when you’re dealing with data and databases where losses, corruption, or failure are catastrophic. Edge cases often benefit from mature solutions with numerous plans patterns for backups, change management, and operational rigor.
Hence a SQL database is nearly always the best choice.
3. Knowledge and community
When things are around for a while a general body of knowledge is built up around them. SQL is no different. Over the years a vast array of shared SQL knowledge in the form of documentation, thriving communities, and plenty of technical talent has developed.
Such a vast body of information with an active community around it does a lot to keep a technology around. Because the community is so active and the documentation so extensive, people and businesses gravitate towards the technology. Because people gravitate towards the technology the community grows and the level of knowledge deepens and is shared with new adopters.
Over the years, this is what’s happened with SQL.
As far as languages go, SQL is easy to learn. It can take just a few days to learn the limited number of functions one can use to run queries and return data. Simple.
Even roles that are traditionally non-technical such as marketing, C-level executives, and non-technical product managers are known to learn basic SQL to support their roles.
“SQL - it’s so easy marketers can learn it.”
Deeply understanding the relational database systems that SQL runs on is another thing. But for a vast majority of simple data queries, SQL is great.
With half of developers using SQL and RDBMS it’s safe to say the language and technology is ubiquitous. This is no bad thing. As mentioned above, knowledge and community thrives in this situation. And due to its simplicity, SQL is almost common knowledge among developers and those they work with.
This means skill sets easily transfer between companies and industries, which means talent is readily available, which in turn fuels knowledge creation and community growth.
The ubiquitous nature of SQL databases has formed a beneficial circular model for growth and its fantastic.
6. Open Source and interoperability
Generally, SQL isn’t completely interoperable. Vendors aren’t known for following the same standards (see Oracle or MS SQL Server), largely due to differing syntax. However, SQL syntax varies only slightly between vendors so it’s still possible to reuse SQL with some modification. But this isn’t ideal and some vendors would rather their syntax wasn’t reusable…
The database technology developers use, Stack Overflow 2017
In fact, PostgreSQL is one of the most desired database technologies of 2017, albeit by only a fraction and vastly outpaced by MongoDB:
The most desired database technologies of 2017, Stack Overflow 2017
This shift towards open source SQL databases is great for the already thriving community and means SQL benefits from a community actively trying to improve the technologies they use.
7. Why code when you can use SQL?
There’s a saying that goes something like…
Don’t do in code what you can get the SQL server to do well for you
The logic behind this idea is that for most cases the database software does a better job at finding the most efficient way of completing your task than anyone possibly could writing code.
In other words - SQL is made for joining data, filtering data, selecting columns and so on. Doing these things in your own custom code instead of relying on SQL and the database software leads to writing unnecessary lines of code with no added value.
Here’s an example. Let’s say we need data to create a “California revenue Q3” report.
You can create this report by writing one line of SQL that magically:
Fetches users from the California table Sorts the data Totals the data Orders the data so you can show one column that says “California revenue Q3 2017”
This is what the one line of SQL would like look:
SELECT SUM(Value_USD) AS California_Revenue_Q3 FROM Transactions WHERE Location = 'California' AND DATEPART(q, Date) = 3 AND YEAR(Date) = 2017;
And if we wanted to break it down by location the SQL would be as follows:
SELECT Location, SUM(Value_USD) AS Revenue_Q3 FROM Transactions WHERE DATEPART(q, Date) = 3 AND YEAR(Date) = 2017 GROUP BY Location ORDER BY Location;
And if we wanted the top five areas by revenue:
SELECT TOP 5 Location, SUM(Value_USD) AS Revenue_Q3 FROM Transactions WHERE DATEPART(q, Date) = 3 AND YEAR(Date) = 2017 GROUP BY Location ORDER BY SUM(Value_USD) DESC;
To run these queries in other languages would be complicated, time-consuming, and take far too much code. SQL was purposely designed to slice data and it does it well. Not to mention that it’s more efficient to bring the computation to the data, rather than bringing the data to the computation.
Bring the computation to the data rather than bringing the data to the computation
8. SQL/RDBMS and NoSQL/DBMS databases play different roles
Databases are tools. They’re not all hammers. You have wrenches, screwdrivers, saws, spanners, etc. Each does a different job and solves a different problem. There are SQL, key value, time series, blockchains, embedded, and more. Each type of database is good at something and bad at others.
Relational databases are fantastic when you need to express relationships in a system when you can’t foresee all possible permutations of data combination, aggregation, or usage. And, honestly, most systems fall into this category. Plus the SQL language itself offers a user-friendly way to organize data in the way you need it.
SQL/RDBMS are just one of many tools for a specific job - and just so happen to be a perfectly feasible tool for many jobs. And when consistent data integrity is essential (for example, in finance), they are the best.
SQL databases have their drawbacks and aren’t the best choice for certain jobs. But for a vast majority of cases they simply blow every other NoSQL solution out of the water.
And if you’re going to get riled up about scale, realistically only a tiny percentage will ever need to worry about scaling a RDBMS - you’re not Facebook or Google. You can still have millions of users with a SQL database and have no issues.
Plus you can scale RDBMS, it just requires knowing the tradeoffs when you do.
For a more in-depth breakdown of the differences between RDBMS and DBMS check this this great Quora answer, which echoes much of what is said above.
Another 43 years?
Despite countless other database systems and technologies, and their growing popularity, SQL databases are undoubtedly here to stay - at least for the foreseeable future. With the advent of big data, deep learning, and IoT it wouldn’t be surprising if SQL databases thrive and expand for another 43 years.
SQL databases have their drawbacks, true. But for most use-cases the extensive community, simplicity of the language, and fundamental structure of RDBMS make it one of the better choices for database technology.
Why do we still use SQL after 43 years? Because it works and gets the job done 90% of the time. As developers with an increasingly complex landscape of technology and integrations, you can’t ask for much more than that.
This article has been updated to make a correction. The article originally stated that Edgar F. Codd developed both RDBMS and the SQL language. This is incorrect. Edgar F. Codd developed the RDBMS while Donald D. Chamberlin and Raymond F. Boyce came up with SQL. Unfortunately, as NoSQL humans, we don’t have ACID to ensure consistency between our notes and our word processor.