17 Sep 2024

Solving hard software problems

Processes
  • For environment setup problems, try to accomplish the same thing on two computers. Differences in the outcome point to the environment rather than the code itself.
  • If you are blocked when trying to solve your problem, set a timebox for how long you will keep trying before changing your approach.
    • Every half-hour or hour, ask yourself "Have I made progress in the last period?" If the answer is no, set a time limit after which you will reassess your next steps.
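
One way to compare two computers, as suggested in the first point above, is to capture a snapshot of each environment and diff them. The following is a minimal sketch, not a complete tool: the function names `snapshot_environment` and `diff_snapshots` are hypothetical, and it assumes the problem involves a Python environment, so it only looks at the interpreter version, the platform, and the installed packages.

```python
import platform
from importlib import metadata


def snapshot_environment():
    """Collect a few environment facts that commonly differ between machines."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            packages[name.lower()] = dist.version
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "packages": packages,
    }


def diff_snapshots(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs.

    Nested dictionaries are compared recursively and reported with dotted keys.
    A key missing on one side is reported with None for that side.
    """
    differences = {}
    for key in sorted(set(a) | set(b)):
        left, right = a.get(key), b.get(key)
        if isinstance(left, dict) and isinstance(right, dict):
            for sub_key, pair in diff_snapshots(left, right).items():
                differences[f"{key}.{sub_key}"] = pair
        elif left != right:
            differences[key] = (left, right)
    return differences
```

To use it, run `snapshot_environment()` on each computer, save the result (for example with `json.dump`), then load both files on one machine and pass them to `diff_snapshots`. Any entry in the result is a candidate explanation for behavior that differs between the two machines.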
14 Apr 2022

Incident post-mortem

Processes
  • Identify the cause of the problem
    • 5 whys
  • List potential solutions
  • Investigate potential sources of similar problems
  • Address the additional sources of risk

  • Reduce incident duration
    • Identify the cause of the problem more rapidly
  • Reduce incident cost
  • Reduce the number of people involved
14 Apr 2022

Incident investigation

Processes
  • Define the incident owner
  • Define the incident secretary/communicator
  • Create and document
    • Summary
    • Observations (link to metrics dashboards with absolute timestamps as much as possible)
      • Screenshots
        • Who took the screenshot
        • Link to get the graph/data
        • Associated conclusions
      • Links to logs
    • Hypotheses/theories
      • Who made them
      • When
      • Whether they have been validated/invalidated
    • The actions taken
      • By whom
      • Whether they had the desired effect
    • etc.
  • If a code regression caused the incident, revert the change and deploy as soon as possible
  • Start by reducing/relieving the impact of the incident before searching for a root cause
  • When data sources do not agree, cross-check them against additional data sources
  • Diagram all the implicated systems and their relationships to one another in order to identify where the problem might be located
  • Test your hypotheses to verify if they hold or not
  • Develop a procedure over time that can be followed to diagnose similar issues
  • Write down a list of improvement suggestions so that the incident does not recur, or so that its impact is lessened if it does

  • Once the incident is resolved, put a summary of the conclusions at the top of the document, with links to the sections of the document that explain the rationale behind each conclusion
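
The incident document described above could be sketched as a small data structure, which makes it easy to see at a glance which hypotheses are still untested. This is only an illustration under assumptions: the class and field names (`IncidentDocument`, `Observation`, `Hypothesis`, `Action`, `open_hypotheses`) are hypothetical and simply mirror the bullet points. The same structure works just as well as headings in a shared text document.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Observation:
    description: str
    recorded_by: str          # who took the screenshot or pulled the data
    source_link: str          # link to the graph, data, or logs
    conclusion: str = ""      # associated conclusion, if any


@dataclass
class Hypothesis:
    statement: str
    author: str
    raised_at: datetime
    validated: Optional[bool] = None  # None until tested, then True or False


@dataclass
class Action:
    description: str
    taken_by: str
    had_desired_effect: Optional[bool] = None  # None until the effect is known


@dataclass
class IncidentDocument:
    owner: str
    communicator: str
    summary: str = ""
    observations: List[Observation] = field(default_factory=list)
    hypotheses: List[Hypothesis] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)

    def open_hypotheses(self) -> List[Hypothesis]:
        """Return the hypotheses that have not been tested yet."""
        return [h for h in self.hypotheses if h.validated is None]
```

Keeping `validated` and `had_desired_effect` as three-state values (unknown, yes, no) reflects the advice to test hypotheses and to record whether each action worked, rather than assuming either.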
03 Jan 2022

Yearly review

Processes

Every year, either at the beginning or end of the year.

A few hours spread over the course of a few days.

  • Review the year according to various facets
  • Plan the next year according to the same facets

  • I use mind-mapping software to do my yearly review and plan.

03 Jan 2022

Monthly review

Processes

Every month, either at the beginning or end of the month.

30 minutes.

  • I use a text editor, such as VS Code, to write my monthly reviews.