DevOps Patterns & Antipatterns for Continuous Software Updates “What can possibly go wrong?!”

Why software updates? @jbaruch #DevOpsWorld #LiquidSoftware http://jfrog.com/shownotes

[Diagram: vulnerability timeline. Labels: vulnerability discovered; exploits A and B each discovered, reported, used in an attack, and patched; vulnerability patched. Two intervals are marked “Update time”.]

“As every company becomes a software company, security vulnerabilities are the new oil spills”

[Diagram: vulnerability timeline, extended. Labels: vulnerability discovered; exploits A, B and C each discovered, reported, used in an attack, and patched. Three intervals are marked “Update time”.]

🎩 @ErinMeyerINSEAD’s “Culture Map”

Shownotes: http://jfrog.com/shownotes
Slides, video, links, comments, ratings, raffle

So, you wanted to update faster…

[Flowchart: Update available. Decisions: Do we want it? Are there any high risks? Do we trust the update? Outcomes: “Let’s update!” or “How about no”.]

Timeline, 2000 to today: Agile, Continuous Integration, Continuous Delivery, Infrastructure as Code, Microservices, Docker, Serverless, IoT

#emptyenvelopefromchina

[Flowchart: Update available. Decisions: Do we want it? Are there any high risks? Do we trust the update? Can we verify the update? (if yes: time-consuming verification). Outcomes: “Let’s update!” or “How about no”.]

[Chart: features that we want vs. acceptance test costs]

Your browser. Twitter in your browser. Twitter on your smartphone. Your smartphone OS?!
[Flowchart: Update available. Do we want it? No one asked you (auto update). Are there any high risks? Let’s update!]

What can possibly go wrong?

Continuous updates pattern: Local rollback
Problem: An update went catastrophically wrong and an over-the-air patch can’t reach the device.
Solution: Save the previous version on the device before updating. Roll back if a problem occurs.
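The local rollback idea can be sketched in a few lines. This is a minimal illustration, assuming the device software is a single file; `apply_update` and the `health_check` callback are hypothetical names, not part of any real OTA framework.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def apply_update(current: Path, new_payload: bytes,
                 health_check: Callable[[Path], bool]) -> bool:
    """Install new_payload over `current`, keeping a local backup.

    Returns True if the update was kept, False if it was rolled back.
    """
    backup = current.with_name(current.name + ".bak")
    # Save the previous version on the device before touching anything.
    shutil.copy2(current, backup)
    current.write_bytes(new_payload)
    if health_check(current):
        return True
    # The update is bad: restore the saved copy, no network access needed.
    shutil.copy2(backup, current)
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        firmware = Path(tmp) / "firmware.bin"
        firmware.write_bytes(b"v1")
        # Hypothetical health check: a payload is "healthy" if it is not empty.
        kept = apply_update(firmware, b"", lambda p: p.stat().st_size > 0)
        print(kept, firmware.read_bytes())
```

The important property is that the rollback path depends only on what is already on the device, so it still works when the update has broken connectivity.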

Continuous updates pattern: OTA software updates
Problem: Physical recalls are costly. Extremely costly.
Solution: Implement over-the-air software updates, preferably continuous ones.

Continuous OTA updates are like normal OTA updates, but better

Continuous updates pattern: continuous updates
Problem: In batch updates, important features wait for unimportant ones.
Solution: Implement continuous updates.

Nub’s horror
New feature update
Uses templating with the $ symbol
Apple’s staging servers return prices without the $ symbol
Some of Apple’s production servers return prices with the $ symbol
As a result, some users suffer crashes
It took time to understand what went wrong
It took time to get the fix through Apple review

Continuous updates pattern: Canary releases
Problem: Releasing a bug affects ALL the users.
Solution: Release to a small number of users first and observe. If a problem occurs, stop the release, then revert or update the affected users.
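One possible way to sketch a canary release is shown below. It assumes users are identified by a string ID and that error and request counts for the canary group are collected elsewhere; `in_canary`, `next_step` and the 1% error threshold are illustrative choices, not taken from the talk.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place roughly `percent`% of users in the canary group."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return bucket < percent


def next_step(canary_errors: int, canary_requests: int,
              max_error_rate: float = 0.01) -> str:
    """Decide whether to continue the rollout based on what was observed."""
    if canary_requests == 0:
        return "wait"  # nothing observed yet
    if canary_errors / canary_requests > max_error_rate:
        return "stop"  # halt the release; revert or fix the canary users
    return "continue"


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]
    canary_users = [u for u in users if in_canary(u, 5)]
    print(len(canary_users), next_step(canary_errors=7, canary_requests=200))
```

Hashing the user ID keeps each user in the same group across requests, so a user who received the canary build keeps seeing it until the rollout is widened or stopped.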

Continuous updates pattern: observability
Problem: Some problems are hard to trace by relying on user feedback alone.
Solution: Implement tracing, monitoring and logging.

Continuous updates pattern: Rollbacks
Problem: Fixes might take time, and users suffer in the meantime.
Solution: Implement rollback, the ability to deploy a previous version without delay.

Continuous updates pattern: feature flags
Problem: Rollbacks are not always supported by the deployment target platform.
Solution: Embed two versions of the feature in the app itself and switch between them with API calls.
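A feature flag can be sketched as follows. The example assumes the flag values arrive as a JSON document from some remote API; the `FeatureFlags` class, the `new_price_format` flag and the two price formatters are hypothetical and exist only for illustration.

```python
import json


class FeatureFlags:
    """Holds flag values; in a real app these would come from a remote API."""

    def __init__(self, defaults: dict[str, bool]):
        self._flags = dict(defaults)

    def update_from_json(self, payload: str) -> None:
        # Only known flags with boolean values are accepted.
        for name, value in json.loads(payload).items():
            if name in self._flags and isinstance(value, bool):
                self._flags[name] = value

    def enabled(self, name: str) -> bool:
        return self._flags.get(name, False)


def format_price_old(price: str) -> str:
    return price


def format_price_new(price: str) -> str:
    # New code path: add the currency symbol only when it is missing.
    return price if price.startswith("$") else "$" + price


def format_price(flags: FeatureFlags, price: str) -> str:
    # Both versions ship in the app; the flag picks one at runtime.
    if flags.enabled("new_price_format"):
        return format_price_new(price)
    return format_price_old(price)


if __name__ == "__main__":
    flags = FeatureFlags({"new_price_format": False})
    print(format_price(flags, "9.99"))
    flags.update_from_json('{"new_price_format": true}')
    print(format_price(flags, "9.99"))
```

Because both code paths are already installed, switching the flag off acts as a rollback even on platforms where redeploying a previous build requires a review.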

You thought your problems are hard?
Things under your control        Server-side Updates   IoT (Mobile, Automotive, Edge) Updates
The availability of the target   ✓                     ✕
The state of the target          ✓                     ✕
The version on the target        ✓                     ✕
The access to the target         ✓                     ✕

KNIGHT-MARE
New system reused old APIs
1 out of 8 servers was not updated
New clients sent requests to a machine that contained old code
Engineers undeployed working code from updated servers, increasing the load on the not-updated server
No monitoring, no alerting, no debugging

Continuous updates pattern: Automated deployment
Problem: People suck at repetitive tasks.
Solution: Automate everything.
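As a rough sketch of what automating a deployment might look like, the function below deploys to every server and then verifies what each one actually runs. The `deploy` and `get_version` callbacks are assumptions standing in for whatever tooling is really in use.

```python
from typing import Callable, Iterable


def deploy_everywhere(servers: Iterable[str], version: str,
                      deploy: Callable[[str, str], None],
                      get_version: Callable[[str], str]) -> list[str]:
    """Deploy `version` to every server and return the ones left behind."""
    servers = list(servers)
    for server in servers:
        try:
            deploy(server, version)
        except Exception:
            # Keep going; the verification pass below reports the failures.
            pass
    # Verify instead of assuming: ask every server what it actually runs.
    return [s for s in servers if get_version(s) != version]


if __name__ == "__main__":
    fleet = {f"server-{i}": "1.0" for i in range(1, 9)}

    def fake_deploy(server: str, version: str) -> None:
        if server != "server-8":  # simulate one server silently missed
            fleet[server] = version

    print(deploy_everywhere(fleet, "2.0", fake_deploy, fleet.__getitem__))
```

The verification pass is the part a manual procedure tends to skip: a non-empty result should fail the release rather than be left for someone to notice later.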

Continuous updates pattern: frequent updates
Problem: Infrequent deployments generate anxiety and stress, leading to errors.
Solution: Update frequently to develop skill and habit.

Continuous updates pattern: state awareness
Problem: Target state can affect the update process and the behavior of the system after the update.
Solution: Know and consider target state when updating. Reverting might require reverting the state.
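One way to make “reverting might require reverting the state” concrete is to pair every state upgrade with a downgrade. The sketch below assumes the state is a plain dictionary with a numbered schema version; `Migration`, `migrate` and the rename example are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

State = dict


@dataclass
class Migration:
    version: int                        # schema version this migration produces
    upgrade: Callable[[State], State]
    downgrade: Callable[[State], State]


def migrate(state: State, current: int, target: int,
            migrations: list[Migration]) -> State:
    """Move `state` from schema `current` to schema `target`, in either direction."""
    by_version = {m.version: m for m in migrations}
    while current < target:
        current += 1
        state = by_version[current].upgrade(state)
    while current > target:
        state = by_version[current].downgrade(state)
        current -= 1
    return state


def rename_key(old: str, new: str) -> Callable[[State], State]:
    def step(state: State) -> State:
        result = dict(state)            # never mutate the caller's state
        result[new] = result.pop(old)
        return result
    return step


# Schema 2 renames "name" to "full_name"; the downgrade undoes it.
MIGRATIONS = [
    Migration(2, rename_key("name", "full_name"), rename_key("full_name", "name")),
]

if __name__ == "__main__":
    upgraded = migrate({"name": "Ada"}, current=1, target=2, migrations=MIGRATIONS)
    print(upgraded, migrate(upgraded, current=2, target=1, migrations=MIGRATIONS))
```

Rolling the code back from version 2 to version 1 without also running the downgrade would leave the old code reading a `full_name` key it does not know about.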

Real life pattern: be kind
Problem: You shame someone publicly; a week later shit happens to you.
Solution: Don’t be a shmuck.

Cloud-dark
New rules are deployed frequently to battle attacks
Deployment of a single misconfigured rule
It included a regex that spiked CPU to 100%
“Affected region: Earth”

Continuous updates pattern: Canary releases
Problem: Releasing a bug affects ALL the users.
Solution: Release to a small number of users first and observe. If a problem occurs, stop the release, then revert or update the affected users.

Continuous updates pattern: zero downtime updates
Problem: You will probably lose all your users if you shut down for 5 weeks (and counting) to perform an update.
Solution: Perform zero-downtime, small and frequent continuous OTA updates.

Continuous updates: Frequent, Automatic, Tested, Canary, State-aware, Observability, *Local Rollbacks

[Flowchart: Update available. Do we want it? Sure, why not? (auto update). Do we trust the update? Are there any high risks? Let’s update!]

“Our goal is to transition from bulk and rare software updates to extremely tiny and extremely frequent software updates; so tiny and so frequent that they provide an illusion of software flowing from development to the update target. We call it the Liquid Software vision.”

Corner cases?

Q&A and Twitter ads
@jbaruch #DevOpsWorld #LiquidSoftware
http://liquidsoftware.com
http://jfrog.com/shownotes