But there are strategies, best practices if you will, which can make the transition easier. One I thought worth mentioning is ESLint. This is a process whereby you statically validate your code to make sure it adheres to a set of rules you define. These rules can be as lax or strict as you like, but, applied properly they make your code more supportable, more readable and less error-prone.
So, how to get started I hear you ask? A lot depends on your working environment. If you use a modern editor like VS Code, Atom or Sublime Text then there will be a plugin you can install to help you along. In your project, you add a
.eslintrc file which defines the rules you want to apply and then you look, with horror, at the hundreds of errors that have suddenly appeared in your code.
But fear not, things aren’t as bad as they appear. Your ESLint plugin will probably have a command to automatically fix “easily fixable issues” such as using single quotes rather than double quotes for strings, or indentation of your code. Once these are all addressed you can fix the rest of the issues manually, and relatively quickly.
I’ve recently completed this exercise with a large code base primarily because we want to put some focus on performance: if I have code that is predictably organised, then this makes the process a lot easier.
A great side effect is that the rest of the developers on the team can pick up the code I’ve been butchering and immediately have some familiarity with the style.
The most popular eslint configuration is published by AirBnB as an open source project. It’s a great starting point, but the beauty of these configurations is that if you don’t like a rule then you can simply turn it off. Or, on a case by case basis you can also turn off the rule in your code. As an example, here is the LDC Via
Linting your code is not, in itself going to make it great code, but it will be a big help to you and, at the end of the day you can go home with a slightly smug feeling that you’ve gone that extra step to be a good programmer. And who doesn’t like to feel a bit smug sometimes?
The fourth and final post in an occasional series looking at each of our preferred code editors.
A lot has been said, written, and ranted, about programming editors and IDEs over the years. Allow us to add to the noise.
It’s Julian’s turn to finish this series by talking about his choice, Sublime Text…
Before I dive into my choice, let’s have a quick recap of my colleagues’ well-considered picks.
Ben works in Atom. He likes it because it’s clean, flexible, open-source, free, built on top of Electron, and has a huge variety of packages available to allow him to tailor it to his preferences and needs.
Matt works in Visual Studio Code. Despite the “Visual Studio” name, this is a lightweight editor, which he now prefers to Atom, not least for its top-notch integration with things like Git. Like Atom, it’s built on Electron and has a good selection of plug-ins to allow him to customise it.
Mark works in WebStorm. This is the lightweight companion product to IntelliJ IDEA, which he likes for its approach to source code control, amongst other things.
As for me, I work in Sublime Text. Why? Because [a] it’s installed and [b] I’m lazy.
A recent customer requirement was to have a process to import a list of data from CSV and incrementally update an existing table of the same data, adding/updating/deleting rows as required in order to get the two into sync. The production table has extra columns, with the CSV import being simply a list of (for argument’s sake) customer IDs and names. The technology here is .NET and SQL-Server.
In the past we might have used a process like this:
- Import the data into a temporary table
- Use a query to delete/archive any production data rows not present in the new imported list
- Use a query to update any existing production data rows that are different in the imported list
- Use a query to insert any new rows from the imported list into the production data
- Present the results (“[n1] rows added, [n2] rows updated and [n3] rows deleted”) to the UI and/or system logs.
All well and good, and it’s a safe and established way of going about it. However… enter the SQL
MERGE command, introduced in SQL-Server 2008. This allows us to perform all of the above data operations in a single SQL statement. Credit to this article (from 2014 - never tell me I’m not bang up to date!) for giving the bare bones on how to go about it, and that is definitely recommended reading for a deeper dive, as is the Microsoft documentation on MERGE. However, for a brief introduction, here is the syntax we’ve used:
CREATE PROCEDURE [dbo].[mergeCustomers]
@tblNewCustomerList CustomerType READONLY
MERGE INTO Customers
USING @tblNewCustomerList NewCustomerList
ON Customers.CustomerID = NewCustomerList.CustomerID
/* For an existing row we just update the name details */
WHEN MATCHED THEN
UPDATE SET Customers.FirstName = NewCustomerList.FirstName,
Customers.LastName = NewCustomerList.LastName
/* For a new row, insert the relevant data */
WHEN NOT MATCHED BY TARGET THEN
INSERT (CustomerID, FirstName, LastName)
VALUES (NewCustomerList.CustomerID, NewCustomerList.FirstName, NewCustomerList.LastName)
/* Row is in the data but not in the upload - delete it */
WHEN NOT MATCHED BY SOURCE
/* Now return the number of rows affected */
The first question this raises is: what’s that
@tblNewCustomerList parameter? The answer is: it’s a user-defined table TYPE, which allows us to pass an entire “table” of data into the stored procedure. In .NET/C# terms this means we create an in-memory
DataTable object, populate it with the contents of the CSV file, and then pass it to the stored procedure. In our case, data lengths and other validation is done in code, in order to ensure (as far as possible) that the data is compliant with any NULL rules or SQL field lengths before being passed, so that SQL won’t reject the entire update.
The C# code is of this form:
DataTable dt = new DataTable();
int rows affected = 0;
// Then repeated chunks of this sort of thing to add all our data to the datatable
row = dt.NewRow()
row[“CustomerID”] = 1;
row[“FirstName”] = “Julian”;
row[“LastName”] = “Woodward”;
// Then when we’ve finished constructing the data
string connString = [insert your SQL connection string here];
using (SqlConnection conn = new SqlConnection(connString))
SqlParameter param = new SqlParameter("@tblNewCustomerList", SqlDbType.Structured)
TypeName = "dbo.CustomerType",
Value = dt
using (SqlCommand cmd = new SqlCommand(“dbo.mergeCustomers", conn))
cmd.CommandType = CommandType.StoredProcedure;
rowsaffected = (int)cmd.ExecuteScalar();
The last piece of the jigsaw, then is this TYPE thing: how do we create that? The answer: go into (I’m using Visual Studio 2015) the “SQL Server Object Explorer” tool (via the [View] menu if it’s not already visible); and open the twisties in the order
Databases -> [your database] -> Programmability -> Types -> User-Defined Table Types
Right-click to add a new one, and then define it pretty much as you would a new table. For instance:
CREATE TYPE [dbo].[CustomerType] AS TABLE
[CustomerID] INT NOT NULL,
[FirstName] NVARCHAR (50) NULL
[LastName] NVARCHAR (50) NULL
That’s all folks!
Or… is it?
Here’s the thing. That
MERGE command is enormously powerful, but the information you get back from it is extremely limited:
@@ROWCOUNT will tell you the number of rows it touched. But remember our original pattern, which allowed us to feed back to the to user the individual counts of rows added/updated/deleted? Let’s say we need to find a way to do that.
The first thing to fix is that at the moment we are touching ALL existing rows with the FirstName and LastName, even if they are in fact the same, so let’s change that so that we’re only touching the rows we need to touch. That’s actually a simple change to the piece of syntax that deals with the case when we have a match, replacing
WHEN MATCHED THEN with
This now tells us the real number of necessary updates, in
@@ROWCOUNT, but still doesn’t give us the separate counts of updates, insertions and deletions that we desire.
One approach might be to have a separate stored procedure which compared the new and existing data and built some reports on what the
MERGE command will do when it’s run. But that’s potentially error-prone, particularly if there’s a chance the data will change in between the two SQL statements being executed.
A better approach involves changing the design of the table slightly to add an extra two columns:
MergeBatchID - char(36) - NULL
MergeAction - char(1) - NULL
Within the SQL procedure we can generate a batch ID as a GUID, thus:
DECLARE @batchID AS char(36)
SET @batchID = NEWID()
and then the updates and inserts can set it appropriately:
UPDATE SET MergeBatchID = @batchID,
MergeAction = ‘U'
Customers.FirstName = NewCustomerList.FirstName,
Customers.LastName = NewCustomerList.LastName and
INSERT (MergeBatchID, MergeAction, CustomerID, FirstName, LastName)
VALUES (@batchID, 'C', NewCustomerList.CustomerID, NewCustomerList.FirstName, NewCustomerList.LastName)
Using this we can then have some further SQL within the same stored procedure to build the detail we need. Possibly this would be more efficient done as a
GROUP BY query, but for our purposes we’re happy with a couple of quick-running
COUNT(*) type queries. Of course, we can’t get the deleted number from the data, because the relevant data has been, um, deleted. But we can derive it from the total returned by
@@ROWCOUNT and the Updates/Creates numbers we’ve already extracted:
DECLARE @Total as int
DECLARE @Creates as int, @Updates as int
SELECT @Total = @@ROWCOUNT
SELECT @Creates = Count(*) FROM ClientCompany cc WHERE MergeBatchID=@BatchID AND MergeAction='C'
SELECT @Updates = Count(*) FROM ClientCompany cc WHERE MergeBatchID=@BatchID AND MergeAction='U'
SELECT @Creates as CreatedCount, @Updates as UpdatedCount,
@Total-(@Creates+@Updates) as DeletedCount, @Total as TotalCount
Naturally, the C# code will also need to modified to read the results. But, other than that, there we have it: the
MERGE syntax, with full user-friendly stats.
In our production example, which isn’t so different from this, an incremental update of about 10,000 rows of imported CSV takes 1-2 seconds.
Nothing in the world of programming ever stays still, and, as time moves on things only change faster and faster it seems.
In our world we do a lot of node.js development and we’ve recently been taking stock of where we are.
As a rule of thumb, you want to be running on even number “LTS” (Long Term Support) versions of node.js. As of today (July 2017), version 6 is where most people should be and then in October the default will switch to version 8. The transparency is great, but the change is relentless.
Some legacy code was, until recently running on the node.js 4.x codestream. This is still supported, but it won’t be after April 2018, so it was time to bring it up to date, and a recent security vulnerability was the perfect opportunity to upgrade the codebase.
It’s when you dig back into older code like that, that you realise how much the style of coding has changed in even just the last 3 or 4 years.
Back then, the default way of working with network calls was
callbacks. When your code is nice and simple, callbacks make a lot of sense and they’re pretty easy to manage. But imagine working in our world where everything is based around network calls to our API, things quickly become hard to manage.
And this is where
promises entered. Our internal standard is to use a package called bluebird which is a really pleasant way of working. Increasingly you’re able to make use of native promises in node.js as more and more support for ES6 is added, but we stick with Bluebird for the moment.
But of course, nothing stays the same. With the advent of node.js 8.x we have support for
async/await. As ever, the grass is always greener, but even from a syntax point of view it looks cleaner, and when you’re dealing with large code bases, that’s an important consideration.
So, for the moment, we’re sticking with node.js 6.11.1, but paying close attention to releases as they come from the node.js foundation.
Over the years books have been written, blog posts too, all about that mythical “work-life balance” thing, and how one maintains an even keel whilst delivering on ever-growing deadlines and requirements.
How do the most productive people get things done? Well, I think it comes down to some simple, no-nonsense tips which build upon our post about “getting started” from a few weeks back:
There are 35 - 40 working hours in a week (unless you’re Mark). Consider reducing that number still further — say to 32, working four eight hour days and then stopping. This really focusses the mind. Funnily enough, when you hurl days, nights and weekends at something, you don’t necessarily get things done any quicker. An hour of laser-sharp focus trumps a day of faffing at the keyboard every time.
Keep it simple
I’ve run a company for a decade now, and since the start have used this baroque piece of accounting software that did everything: payroll, stock… all sorts. Didn’t need 80% of it. Recently I switched to a multi-tab spreadsheet for my accounts instead. Financial stuff gets done in a fraction of the time it took me to plug in my numbers before, and my VAT return details just fall out too: wonderful!
A spreadsheet — no apps, no monthly cloudy subscriptions — just one file for one job.
Plan away from the screen
It’s so easy to sit down at your editor of choice and hack away. Given that many of us are stationery geeks, why not simply grab a favourite pen and notebook, really think about what it is you want to do, and plan. When you reckon you’ve done enough thinking and planning, you probably haven’t. Ponder, gaze out of the window, go for a walk, and after all that write down some more. Then sit at your screen and get your work done.