A rule that has served me well: for inexperienced developers, take the estimate, double the number, and bump the unit up an order of magnitude. So a task they say would take two weeks ends up being four months.
For experienced developers, halve the number and bump the unit up an order of magnitude. So your 10-day estimate would be 5 weeks.
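The heuristic above can be sketched as a tiny function. The unit ladder and helper name are my own framing; the comments only give the rule and its two worked examples.

```python
# "Double (or halve) the number, then bump the time unit" estimation heuristic.
# The unit ladder is an assumption: each step is treated as one order of magnitude.
UNITS = ["hours", "days", "weeks", "months"]

def adjust_estimate(amount, unit, experienced=False):
    """Scale the raw estimate and move it up one unit on the ladder."""
    factor = 0.5 if experienced else 2
    bigger_unit = UNITS[UNITS.index(unit) + 1]
    return amount * factor, bigger_unit

print(adjust_estimate(2, "weeks"))        # 2 weeks -> (4, 'months')
print(adjust_estimate(10, "days", True))  # 10 days -> (5.0, 'weeks')
```

Both calls reproduce the examples in the thread: an inexperienced dev's two weeks becomes four months, and an experienced dev's ten days becomes five weeks.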
The biggest estimation effort I ever took part in was a whole-system rewrite[1] where we had a very detailed functional test plan describing everything the system should do. The other lead and I went down the list, estimated how long everything would take to re-implement, and came up with an estimate of 2 people, 9 months.
We knew that couldn't possibly be right, so we doubled the estimate and tripled the team, ending up with 6 people for 18 months - which turned out to be almost exactly right.
[1]: We were moving from a dead/dying framework and language onto a modern language and well-supported platform; I think we started out with about 1MLOC on the old system and ended up with about 700K on the rewrite.
10 days was already after I used this algorithm. The previous several tasks on that codebase were estimated pretty well. The problem is that some tasks can indeed take SEVERAL orders of magnitude more time than you thought.
One of the hardest problems with estimating, for me, is that I mostly do really new tasks: either ones no one wants to do because they are arduous, or ones no one knows how to do yet. Then I go and do them anyway. Sometimes on time, mostly not. But everybody working with me already knows that it may take long, but I will achieve the result. And in rare instances other developers ask me how I managed to find the bug so fast. This time I was doing something I had never done before in my life, and I missed some code dependencies that needed changing while I was reviewing how to do the task.
You have to be careful of the perceived politics around this. Tall poppies get cut down. I still don’t totally understand why but sometimes taking initiative doesn’t sit well with the folks who want their trains to run on time.
I think this varies from person-to-person or maybe organization-to-organization. I've definitely seen variations in the health of organizations but I think you can break up the categories used to judge them, e.g., meritocracy, overtime frequency, planning accuracy, psychological safety, value of work, etc.
I'd say the place I worked felt above average in meritocracy. In other words, it felt like folks sticking out to take initiative were more often rewarded than punished. I don't think we were perfect in every category though.
License plate numbers are generally considered PII in their own right. A tuple of make, model, color, and year range is getting awfully close to an equivalent on its own as well.
no they're not. PII has to be able to identify an individual.
anyone can in theory be driving a car. is it my wife, or me, or my kid taking the station wagon out this weekend?
it's also why red light cameras and speed cameras send tickets to the registered owner, not necessarily whoever is driving. my sister-in-law borrows the car and I get the ticket
Generally "I wasn't driving then" is actually a defense to the automated cameras. The registered-owner thing is just the first pass, like any other lazy investigation.
In the broader context PII is a looser concept, and can be thought of like browser fingerprinting. The legal system hasn't formalized it to nearly the same degree, but it does have the concept that enough otherwise-public information, sufficiently correlated, can break into the realm of privacy violations. In the browser fingerprinting world that's thought of pretty explicitly in terms of contributed bits of entropy; the legal system, meanwhile, has pushed back on mass public surveillance when it steps into the realm of stalking, or a form of investigation that should require a warrant.
PII isn’t limited to SSNs. By your logic, a first name couldn’t be PII, and a last name with no accompanying info wouldn’t be PII. Different types of data have different risk profiles, and when multiple records about an individual are combined the risk grows rapidly. Location is absolutely PII when combined with other risky data, like a license plate.
Nearly all time series databases store single value aggregations (think p95) over a time period. A select few store actual serialized distributions (Atlas from Netflix, Apica IronDB, some bespoke implementations).
Latency tooling is sorely overlooked, mostly because the good tooling is complex and requires corresponding visualization tooling. Most of the vendors have some implementation of heat map or histogram visualization, but either the math is wrong or the UI can’t handle a non-trivial volume of samples. Unfortunately it’s been a race to the bottom for latency measurement tooling, with the users losing.
I take it as a given that what is stored and graphed is an information-destroying aggregate, but I think that aggregate is actually still useful and meaningful.
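A quick illustration of why storing only a per-window p95 is information-destroying: percentiles don't compose, so any re-aggregation of stored p95s (averaging them, say) can be far from the true p95 over the combined period. The data below is made up and the helper is a simple nearest-rank percentile, not any particular TSDB's implementation.

```python
# Why per-window p95s can't be recombined into a correct overall p95.
import random

random.seed(42)

def p95(samples):
    """Nearest-rank 95th percentile (simplified, no interpolation)."""
    s = sorted(samples)
    return s[int(0.95 * (len(s) - 1))]

# Two windows with very different latency profiles (ms) -- hypothetical data.
window_a = [random.gauss(10, 2) for _ in range(1000)]   # steady fast traffic
window_b = [random.gauss(200, 50) for _ in range(50)]   # a brief slow burst

stored = [p95(window_a), p95(window_b)]   # what a typical TSDB keeps per window
naive = sum(stored) / len(stored)         # re-aggregating the stored p95s
true = p95(window_a + window_b)           # p95 over the raw, merged distribution

print(f"avg of stored p95s: {naive:.1f} ms, true combined p95: {true:.1f} ms")
```

The exact numbers don't matter; the point is that the two answers diverge badly, because the raw distribution was thrown away. Systems that serialize actual distributions (histograms or digests) can merge windows correctly, which is what the Atlas/IronDB-style approach buys you.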