Faustian Pacts and Software Excellence

I saw this tweet recently and it sparked a few thoughts as it neatly gets to the heart of the biggest challenge in software – getting to done.

In my experience there is very often a tension on any sizable software project between the desire of the business to get the software in front of customers as quickly as possible and the engineering team's desire to ensure the code and architecture are of a high enough quality to make refactoring, scaling, testing and the addition of new features as easy as possible. Both desires are equally valid.

The business needs to expand market share, ship code, have happy customers and get paid.  Possibly by being first to market.  Possibly because commitments and contracts have been signed promising a hard delivery date.

Equally, developers understand the value of flexible architecture and code that is easy to refactor, test and debug. Further, developers know that without these things code becomes harder to work with, meaning quality is hard to ensure, which leads to bugs, delays in delivery and ultimately unhappy customers.

The tension comes from the assumption on both sides that the other should see their view as perfectly obvious and correct. Both tribes are worrying about the same thing from a different viewpoint. Simplistically:

  1. The business worries that if the customer doesn't get their software, we won't get paid and we'll be out of a job.
  2. The dev team worries that if the software isn’t right we’ll be shipping slow buggy software, we won’t get paid and we’ll be out of a job.

It is in the middle of these two tribes and this argument that the senior/lead developer will find themselves.  And it is here that both sides have to communicate and trust each other.

Communication

At the start of the project we could agree (simplistically):

We must ship this project by June 2020, otherwise we lose the contract to our competitor.  We need to make some pragmatic decisions, take on a certain amount of technical debt.  Once we deliver we won’t promise any major new features until the technical debt is resolved.

Trust

Regarding trust: the business needs to trust that the team will deliver and make the necessary pragmatic decisions. The developers need to trust that the business will allow them the necessary space to address the technical debt.

Profit

Which takes us back to the tweet. Code that is profitable but carries an amount of tech debt is acceptable, as long as each side understands the reasons and the trust exists to resolve it as/if needed.

Faust

Unfortunately, in many software projects the above turns into a Faustian pact for the dev team. Reluctantly they allow the technical debt to build up, trusting that time will be given to resolve it. In the meantime the business signs more deals, or demands new features with harder deadlines, and the project becomes exactly what the dev team feared – unwieldy, slow and buggy – and stress increases.


Excellent teams and businesses understand this, and ensure that both deadlines are hit and code quality is high, with time allowed to address technical debt.   In the end it’s only the excellent teams that will succeed.

 


Address Search OS OpenNames with PostGIS, SQLAlchemy and Python – PART 2

Part 1 of this post outlined how to configure a PostGIS database to allow us to run Full Text searches against the OS OpenNames dataset.

In Part 2 we look at writing a simple Python 3 CLI app that will show you how easy it is to integrate this powerful functionality into your apps and APIs.  Other than Python the only dependency we need is the  SQLAlchemy ORM to let our app communicate with Postgres.


Installing SQLAlchemy

SQLAlchemy can be installed using pip. It depends on psycopg2, which can be frustrating to install on a Mac without Postgres present (solutions can be found on Stack Overflow).
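A minimal install sketch (assuming pip points at your Python 3 environment; on some setups you may need pip3, or the psycopg2-binary package, instead):

pip install sqlalchemy psycopg2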

A simple address search CLI


import argparse

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.dialects.postgresql import TSVECTOR
from sqlalchemy.orm import sessionmaker

# Create DB Session
engine = create_engine('postgresql://iain:password@localhost:5432/Real-World')
Session = sessionmaker(bind=engine)
session = Session()
Base = declarative_base()


class OpenNames(Base):
    __tablename__ = 'open_names'

    # Map DB columns we're interested in
    ogc_fid = Column(Integer, primary_key=True)
    text = Column(String)
    textsearchable = Column(TSVECTOR)

    def search_address(self, search_for: str):
        print(search_for)
        or_search = search_for.replace(' ', ' | ')  # Append OR operator to every word searched
        results = session.query(OpenNames.text).filter(
            OpenNames.textsearchable.match(or_search, postgresql_regconfig='english'))
        for result in results:
            print(result.text)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('address', help='Address you want to search for')
    args = parser.parse_args()

    open_names = OpenNames()
    open_names.search_address(args.address)

Let me draw your attention to…

Hopefully this script is fairly easy to follow, but there are a couple of lines to draw your attention to:

  • The TSVECTOR import – note we have to tell SQLAlchemy we're using the Postgres dialect so it understands TSVECTOR.
  • The engine, Session and Base lines are simply SQLAlchemy boilerplate that set up the connection and session for the app. You'll need to swap out the connection details for your own.
  • In the OpenNames class I've chosen to map only 3 columns; you'll probably want to map more.
  • The or_search line is very important – here we append the OR operator to every word the user has supplied, meaning we return addresses that match any of the words. You could extend this to allow the user to specify an exact-match operator and change this to an & search (see the sketch after this list).
  • Finally, note we ask SQLAlchemy to match our search, and importantly we must supply the postgresql_regconfig param to say we're searching in English. This is vital or you won't get the matches you expect.
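For example, a minimal sketch of that exact-match variant might look like this (the and_search name is just for illustration, and it assumes the same OpenNames mapping and session as above):

# Hypothetical AND variant: every word the user typed must match.
# Roughly equivalent to: textsearchable @@ to_tsquery('english', 'forth & street')
and_search = search_for.replace(' ', ' & ')
results = session.query(OpenNames.text).filter(
    OpenNames.textsearchable.match(and_search, postgresql_regconfig='english'))
for result in results:
    print(result.text)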

Running our app

We can run our app from the command line simply by entering the following command:

python address_search.py 'forth street'

And we see our app print out all matching addresses that contain either Forth or Street 🙂

Ends

Hopefully you can see how easy it would be to take the above code and integrate it into your apps and APIs. I hope you've found these tutorials useful. Happy text searching.

5 lessons from 3 years at a start-up

Some thoughts in no particular order after 3 years at a start-up

Have a plan – sounds obvious, but a weakness of agile is that it can give rise to the illusion that there's a plan. In reality, planning is emergent as the iterations and stories float by. Emergent planning means the team can drift or become distracted, and it's hard to turn down non-core projects because you can't point to a strategy or a planned delivery. Plans can be flexible, tested in the MVP style, and changed when they are proved not to be working – but there's no excuse not to have one.

Then ensure everyone is signed up to the plan. Even in a small team it's easy for factions and agendas to emerge. Getting everyone pulling in the same direction is non-trivial.

Sales and marketing are waaay more important than devs admit/realise – Make time to support sales and marketing efforts. Devs love to scoff at sales people with their suits, lines of BS and vague promises. But the hard fact is there are very few successful products that have gained market share on technical superiority alone, and the chances are your team is not producing one of them. You need to think long and hard about your sales and marketing approach.

Only today did I read in the Sunday Times that the publishers of Grand Theft Auto hired Max Clifford to create a media shit-storm regarding the moral failings of the game, resulting, of course, in millions of additional sales.

Avoid non-core projects at all costs – Pressure for sales may mean you’re tempted to take on side projects, or do free work in exchange for some kind of marketing exposure. DON’T!!  DON’T EVEN THINK ABOUT IT!!

My experience was that this was a huge distraction, a money-pit, a time waster and generally a bad idea that should be pushed back against at all costs. If you're tempted and think you can manage it – trust me, it will still be a distraction. If you're still tempted, time-box the work hard and ensure all stakeholders understand that there's a maximum amount of time you can afford.

Don't white-label and abstract features until at least 2 customers ask for them – This is basically a rewording of YAGNI. It's tempting to assume all customers will want feature X or Y; however, until you have hard evidence that multiple customers want the same feature, avoid wasting time abstracting it. This sounds simple but is very difficult to police and make hard/fast decisions about without getting devs' backs up – kanban boards etc. can help here to demonstrate to the team how these tasks can add time and cost to the project.

Invest in your team – This doesn't just mean salaries, it means listening to your employees. If you notice the team doing a lot of overtime, do something about it. Encourage R&D, make sure they have some "slack" time, pay for them to attend conferences, encourage them to blog, take them out for dinner. Encourage experimentation with new technologies. Allow flexi-time and homeworking.

Things like this make a job enjoyable, and mean your team aren’t scouring the job ads.

So in conclusion, as usual, the golden rule is that there are no golden rules. No doubt success can be achieved by ignoring all of the above, but these are the lessons that stuck out to me over the last few years.

See also:

The SDK business is dead – It’s a commodity market now.

The Mythical Version 1.0

As a breed, we hackers are perfectionists. Tinkering away at that algorithm, worrying about the size of that switch statement, wondering about abstracting away some detail. But always, always with the aim of improving our code base.

Many of our number are also a bunch of nit-picking, passive-aggressive, show-boating arseholes.  Although these traits are kind of endearing once you realise that optimus1337, who is currently comparing you to Hitler, is probably 19, his Mum thinks he’s a wonderful lad, and he helps his Gran with her shopping at the weekends.

However, there is an unfortunate consequence of these two character traits. It can make it very intimidating to put out your opinion or share some code with your peers. We'll hoard code, or practice at home, but not want to put something out there because it's not perfect, or we won't contribute to a project for fear that we'll be shouted down, or that what we produce won't meet some sort of arbitrary ultra-geek standard.

This attitude can be seen in the insanely conservative version numbers we give any code that we are brave enough to put out into the wide world, ie MyProject – v0.0.001. For example, I'm a massive fan of the NAnt project and have been using it to build my solutions for the last 4 years. In that time the project has gone from version 0.86Beta1 to the recently released 0.91. In the entire 4 years I've been using it, it's been as solid as a rock, and I haven't had one issue with it, ever!

There’s no such thing as done

All developers implicitly understand that no project is ever finished, no piece of code is ever completely bug free, and there is nothing that couldn't be refactored. Which makes a "done" project as rare as the legendary unicorn.

A few years back, when projects started versioning themselves after the year/month they were released, ie Ubuntu 12.04, Office 2010 etc., I was very cynical, thinking this was just a marketing ploy to make us download/purchase the latest version.

However, I’ve lately realised that this versioning scheme has the benefit of indicating that this software is just that year’s version, or that month’s version.  It doesn’t say this software has reached mythical v1 status, it just says this is the stuff we think is good enough to release now.  The marketing aspect is just a fringe benefit 🙂

Conclusion

So don't worry about joining the melting pot – jump right in. Release version 12.3.09 of that idea you've been working on. You can still conform to semantic versioning, and tell the likes of optimus1337 "Dude, relax. The code's not done, it's just the stuff I wanted to share, and BTW that's not how you spell Goebbels ;-)"

Update – Auto Packaging using CSPack and Azure SDK 1.6

This post is related to two of my previous posts:

Azure 1.5 ate my diagnostics

I had diagnostics working quite happily until SDK 1.5 came out. Then all of a sudden data was no longer being transferred to Azure storage. Even more mysteriously, diagnostics would happily transfer data to Azure storage when being emulated locally, but not when running on the Azure cloud (in other words, a nightmare problem).

I didn't get around to investigating why till this week. I saw that several people had the same problem, and assumed that I wasn't configuring the diagnostics correctly in the OnStart method.

Finally I saw this forum thread. The thread described how, if you upload your solution from Visual Studio, diagnostics work correctly, but not when the solution is deployed from the build process. I tried it for myself, and yep, diagnostics would magically work when the solution was deployed from Visual Studio. This finally clued me in to the fact that the problem had nothing to do with the code, but everything to do with packaging. Which leads us to this update on auto packaging your Azure solution.

Configuring Your Azure Continuous Integration process with CSPack and SDK 1.6

My previous post on using CSPack to automatically build your deployment packages is largely still correct, but as of (I assume) SDK 1.5 there's a new EntryPoint property.

So you need to specify the name of the DLL that is the entry point to your solution – in my case HuzuSocial.App.dll. My AzureProperties.txt file now looks like this:

TargetFrameWorkVersion=v4.0
EntryPoint=HuzuSocial.App.dll
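
For reference, a CSPack invocation that consumes this properties file might look roughly like the following sketch – the .csdef path, role binaries/site directory and output package name here are assumptions based on this solution, so adjust them to match your own build script:

cspack HuzuSocial.Azure\ServiceDefinition.csdef /role:HuzuSocial.App;HuzuSocial.App /sites:HuzuSocial.App;Web;HuzuSocial.App /rolePropertiesFile:HuzuSocial.App;AzureProperties.txt /out:HuzuSocial.Azure.cspkg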

Now configured correctly, Diagnostics works as expected from our Continuous Integration process.

Windows Azure Diagnostics with SDK 1.6 for WebRoles

There appears to be a lot of conflicting and confused advice about configuring Diagnostics on Windows Azure.  The situation is not at all helped by Microsoft’s own site which, to paraphrase Morecambe and Wise, has all the right pieces of information, just not necessarily in the right order.

It doesn’t help that what used to work with earlier versions of the Azure SDK, no longer works with later versions.  So here I outline:

  • The steps to get Diagnostics outputting correctly to Windows Azure Storage with SDK 1.6 for WebRoles (although I’d imagine it’s largely the same for WorkerRoles)
  • Azure 1.5 ate my diagnostics – Another post where I update my Auto Packaging post to be compatible with SDK 1.6

Setting up Windows Azure Diagnostics for your WebRole with SDK 1.6

1. Configure Web.Config – required if you are using Trace statements

I use Log4Net for my general logging/tracing needs, so I don't use Trace statements; thus the example shown in step 3, below, does not require you to complete this step.

However, if you are using Trace statements,  ie:

System.Diagnostics.Trace.TraceError("Error has occurred");

You'll need to configure Web.config as described here:

<system.diagnostics>
    <trace>
        <listeners>
            <add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener,
                Microsoft.WindowsAzure.Diagnostics,
                Version=1.0.0.0,
                Culture=neutral,
                PublicKeyToken=31bf3856ad364e35"
                name="AzureDiagnostics">
                <filter type="" />
            </add>
        </listeners>
    </trace>
</system.diagnostics>

2. Initialise Diagnostics

As outlined here, you’ll need to ensure you add the Import element for the Diagnostics module in your ServiceDefinition.csdef file.  Here’s what mine looks like:

<?xml version="1.0" encoding="utf-8"?>
<ServiceDefinition name="HuzuSocial.Azure" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
    <WebRole name="HuzuSocial.App" vmsize="Small" >
        <Sites>
            <Site name="Web">
                <Bindings>
                    <Binding name="Endpoint1" endpointName="Endpoint1" />
                </Bindings>
            </Site>
        </Sites>
        <Endpoints>
            <InputEndpoint name="Endpoint1" protocol="http" port="80" />
        </Endpoints>
        <Imports>
            <Import moduleName="Diagnostics" />
        </Imports>
    </WebRole>
</ServiceDefinition>

Secondly you’ll need to add your Azure Storage Account details into your ServiceConfiguration.cscfg, mine looks like this (obviously replace with your account name and key):

<?xml version="1.0" encoding="utf-8"?>
<ServiceConfiguration serviceName="HuzuSocial.Azure" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*">
        <Role name="HuzuSocial.App">
        <Instances count="2" />
        <ConfigurationSettings>
            <Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString" value="DefaultEndpointsProtocol=https;AccountName=[youraccountnamehere];AccountKey=[youraccountkeyhere]" />
        </ConfigurationSettings>
        <Certificates>
        </Certificates>
    </Role>
</ServiceConfiguration>

3. Override the OnStart method in WebRole.cs

In the root of your web project you should have a WebRole class. You'll need to override the OnStart method to correctly initialise the Diagnostics. There is loads of different sample code out there, some of it highly dubious. This is my configuration, and it works well for me (I lifted this from a post out there somewhere; unfortunately I forgot to bookmark it and can no longer find it, so thank you, whoever you are).

// Assumes the usual usings: System, System.Collections.Generic and Microsoft.WindowsAzure.Diagnostics
public override bool OnStart()
{
    DiagnosticMonitorConfiguration diagConfig = DiagnosticMonitor.GetDefaultInitialConfiguration();

    var perfCounters = new List<string>
    {
        @"\Processor(_Total)\% Processor Time",
        @"\Memory\Available Mbytes",
        @"\TCPv4\Connections Established",
        @"\ASP.NET Applications(__Total__)\Requests/Sec",
        @"\Network Interface(*)\Bytes Received/sec",
        @"\Network Interface(*)\Bytes Sent/sec"
    };

    // Add perf counters to configuration
    foreach (var counter in perfCounters)
    {
        var counterConfig = new PerformanceCounterConfiguration
                            {
                                CounterSpecifier = counter,
                                SampleRate = TimeSpan.FromSeconds(5)
                            };

        diagConfig.PerformanceCounters.DataSources.Add(counterConfig);
    }

    diagConfig.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1.0);

    //Windows Event Logs
    diagConfig.WindowsEventLog.DataSources.Add("System!*");
    diagConfig.WindowsEventLog.DataSources.Add("Application!*");
    diagConfig.WindowsEventLog.ScheduledTransferPeriod = TimeSpan.FromMinutes(1.0);
    diagConfig.WindowsEventLog.ScheduledTransferLogLevelFilter = LogLevel.Warning;

    //Azure Trace Logs
    diagConfig.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1.0);
    diagConfig.Logs.ScheduledTransferLogLevelFilter = LogLevel.Warning;

    //Crash Dumps
    CrashDumps.EnableCollection(true);

    //IIS Logs
    diagConfig.Directories.ScheduledTransferPeriod = TimeSpan.FromMinutes(1.0);

    DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", diagConfig);

    return base.OnStart();
}

4. That’s it

When deployed to Azure your diagnostics should be successfully transferred to Azure Storage. To analyse them in any meaningful way, I'd recommend Cerebrata's Diagnostics Manager, which gives you a nice dashboard.