Of Football and MinIO

Coming off the Super Bowl last weekend, it seemed apropos that I write this blog entry. Because I want to talk about a football.

You know the Peanuts bit, right? Charlie Brown lines up to kick the football. Lucy promises not to move it. Charlie Brown sprints and commits his whole body. Then Lucy yanks it away. Charlie is left gasping flat on his back, embarrassed, expectations shattered.

That is the thirty-year-long open-source trauma loop I’ve observed in a single gag. MinIO is the open-source community’s latest Lucy in February 2026.

As of today, the minio/minio repo is read-only. It’s archived. The README says the repository is no longer maintained. That the Community Edition is “source code only” with no precompiled releases. Security fixes handled “on a case by case basis.”

For former users or contributors, that can feel like Lucy, holding the football above her head, smiling.

I have thoughts.

Lucy Is People

I met with several folks from the company about half a decade ago. Sincere, bright, enthusiastic. They wanted us in their ecosystem. My coworker and I had found the project on GitHub, tried it, and it solved real problems our other options did not.

But there was a clear warning sign by late 2019. Operator, console, KES, and Sidekick were committed with AGPL to their repo. More of that seemed to be coming. But the AGPL license wasn’t allowed where I worked.

Lucy, Linus, and Licenses

A little history is in order.

In May 2021, after 18 months of increasing AGPL-licensed contributions from the company to the repo? MinIO finally relicensed its server from Apache 2.0 to AGPLv3. They framed this as a move “from Open Source to Free and Open Source.” They said it had become “very difficult to avoid the AGPL dependency for any reasonable production environment.”

Not that AGPL is evil! It is designed to close the “we run it as a service so we never share changes” loophole.

But it felt bad for contributors because of what it meant in this particular sequence. The maintained branch moved to AGPL. The old Apache branch began to rot. Your “choice” became: “accept the new terms and risk your IP”, “fork and maintain it”, or “pay for a commercial license.”

Obligatory I-Am-Not-A-Lawyer. How far AGPL copyleft actually reaches depends on what counts as a “derivative work” and how you integrate the software. Talk to an actual lawyer if you care.

I’m putting a pin in the phrase: “derivative work.” It is doing more heavy lifting in 2026, I think, than anyone realized in 2021.

Regardless, “AGPL in the blast radius” creates a challenging compliance burden at many companies. OSS Legal review overhead. A nagging “what if they change the deal again?” anxiety tax. And often an automatic rejection.

Anyway, this is where I found Linus Torvalds’ framing useful. He chose GPLv2-only for the Linux kernel. He did not want to be at the mercy of someone else’s future licensing decisions. And in particular, he resented someone else changing the license terms under the code he wrote.

That YouTube video resonated with me when I stopped being a MinIO cheerleader in late 2019. They had lured me in with one license; I’d modified the code and followed that license. And at the moment it started getting good? That football was yanked away.

Financially, MinIO eventually closed a $103M Series B at a $1B valuation, led by Intel Capital and SoftBank Vision Fund 2. At the time the company had fewer than 45 employees. The adoption metrics (762 million Docker pulls, the GitHub stars, the Fortune 500 penetration) were the asset. The license change was how you monetize the asset.

Classic dual-licensing play. It wasn’t a new phenomenon. Not morally or legally wrong. But remember what it depends on: the copyleft has to be enforceable. Hold that thought.

Charlie Brown Meets The Ground

Post-2021-relicense, MinIO went public with license-violation accusations against companies that had embedded MinIO under the original Apache 2.0 terms.

Nutanix (July 2022): MinIO accused Nutanix of distributing the MinIO binary without attribution and claimed to be “terminating and revoking” licenses under both Apache v2 and AGPL v3. Nutanix eventually acknowledged “inadvertent omissions” and later removed MinIO entirely.

Weka (March 2023): same playbook, without even prior private contact. Weka pushed back, arguing that Apache 2.0 is explicitly irrevocable (Section 2). Darren Shepherd, chief architect at Acorn Labs, put it bluntly: “The optics of this for MinIO are just bad, whether it is justified or not. I don’t even get how one can revoke a license.”

Maybe those targets deserved it. Maybe they didn’t. Attribution compliance matters. But the enforcement pattern? Public accusations, trying to revoke rights without a legal basis, warnings of financial injury to downstream customers of those whom they had accused? Ouch. Talk about a way to eat your cache of consumer goodwill. This was the moment many realized using MinIO was a mistake a business might pay dearly for later.

Lucy Feels Personal

This story is not exactly abstract for me.

Around the time of the AGPL transition, I was part of a project that had been experimenting with MinIO. I became intimately familiar with the source code. And once the license began to shift, I realized: anything I built might be tainted by my knowledge of their codebase.

The mere fact that I had read their code, understood their architecture, internalized their patterns, extended and modified it under Apache 2.0 to suit my purposes? That was enough to create a contamination risk. At least in my mind. Not a certainty, but definitely a risk, and one my employer could not afford.

So the project I was working on pivoted. And I pivoted away from it. I stopped writing that kind of code and started managing people who wrote code instead. I succumbed to the pressure to become a middle manager instead of a creator. It was a nice run for five years: I built a high-performing team. And we created a great product that’s heavily used to this day (or so I hear).

Then I got really sick, spent months away from work to recover, realized that managing was no longer what I wanted to do, and less than a business day after leaving that company I had an offer that more closely aligned with my goals.

Now to be realistic? MinIO’s Lucy-like licensing leaks to AGPL were not the dominant factor in my decision. They were maybe… like item #17 on my spreadsheet of career considerations. But they were on the spreadsheet. Real licensing decisions affect real human careers.

Lucy And The Football

MinIO was changing the rules while pulling practical usability out of the community path, faster and faster. In March 2024, MinIO introduced Enterprise Object Store (later rebranded AIStor), drawing an explicit line between the community edition and the commercial product. Around May 2025, MinIO removed the administrative web UI from the Community Edition console. Cofounder Harshavardhana explained it as a maintainability and security issue and told users: for UI-based admin, move to AIStor or use the mc CLI. When asked if it would come back: no plans. By late October 2025, the Community Edition shifted to source-only distribution. No maintained binaries. No official container images. Build it yourself. And as of February 12, 2026 – last night – the repo README indicated “THIS REPOSITORY IS NO LONGER MAINTAINED.”

I mourned a little bit. I’d spent a lot of time with that code many years ago. It felt like a little GitHub funeral.

It’s Not Really About Charlie. Or Lucy. Or Linus. Or even MinIO.

When I first drafted this post, I was stuck in late-2010s thinking. I wrote it like a straightforward open-source rug-pull: build adoption on permissive terms, accumulate switching costs, change the deal. Lucy pulls the football, same as always.

But sitting here in 2026, I realized that presenting it that way would colossally miss the point.

So here is my new working thesis: open-source dual-licensing depends on copyleft enforceability. AGPL’s value as a monetization tool requires proving “derivative work”: tracing the chain from source to product, demonstrating that your code derived from their code. The commercial license is the escape hatch: pay us and we remove the AGPL obligations. Sue those who don’t comply, as MinIO did. That business model works as long as the derivation chain is traceable.

But because of large language models, it’s becoming untraceable.

If an LLM trained on MinIO’s codebase, plus Ceph, plus every distributed systems paper ever published, generates functionally equivalent S3-compatible object storage: is that a “derivative work”? Courts have not ruled. The legal theory is unsettled. And in practice, no one is checking.

Compare it to music. Warner Music Group threatened to sue Suno for training on their catalog, then settled for equity and licensing rights. The difference there is kinda important: music provenance is often traceable. There are distribution logs, streaming records, identifiable melodies, and never-ending BitTorrent IP addresses of people downloading “free music”. WMG could prove their catalog was ingested by Suno employees, apparently (though all the negotiations were behind closed doors, so nobody really knows if they weren’t In The Room Where It Happened…)

Code doesn’t work like that. Imagine a developer vibe-codes an S3-compatible object store in 2026. The AI that helped was trained on a GitHub snapshot from 2019 that included MinIO under Apache 2.0. How do you prove derivation? The code has no watermark. There is no real distribution log, no throat to choke. The code was freely available to download from GitHub.

Good luck proving where that code snippet came from.

Combine that with current US policy to rescind regulations that might hinder AI innovation and the resultant chilling effect on lawsuits alleging improper sourcing of training data? Good luck proving GPLv3 or AGPL infringement for a vibe-coded closed-source enterprise app that just happens to smell like that thing you wrote back in 2014.

(Aside: Executive Orders have gotten out of hand. We should probably call a Convention to do something about that.)

Now back to my foreshadowing earlier. Remember my career pivot? I was so concerned about simple knowledge of MinIO’s codebase infecting the code I was building that I stopped building. Between 2019 and 2021, that felt like a reasonable precaution.

In 2026, that idea seems almost quaint.

Software engineers who’ve adopted AI now are using it to write most or all of their code. They mainly work toward a coherent higher-level result. The distinction of “where that code came from” has lost much relevance. Future-me would not have worried about code contamination: I would have thought of a feature, carefully outlined it in planning docs, AI would write it to my specs, and I would have tested it and submitted the PR. The contamination anxiety that partly drove my career change half a decade ago is dissolving in a world where everyone’s code is a slurry blended from everything else.

That is … well, kind of an upside for me personally. I am 100% loving my AI-driven coding workflows. I get better and faster and more accurate at it every day. It helps me get my head out of the weeds of pure process-driven thought, and into shipping actually working inventions.

But I suspect this poses an existential threat to every company whose business model depends on the opposite being true. On strong provenance guarantees and provable license violations.

Including basically the whole open-source community.

Anthropic just announced that their agents autonomously built a working C compiler. I strongly suspect that a very careful source code audit would find most of the functions in that code base had some very-similar function from other open-source projects, under a license that’s unfriendly to closed-source businesses.

And there’s probably nothing the creator and licensor of that code can do about it.

So: why write and open-source something if it is just fodder for AI to train on with no accountability? Why not vibe-code it yourself and figure out how to monetize it, even if chunks of the training corpus were AGPL-licensed? If US law continues to treat AI training as fair use, the copyleft enforcement mechanism – the very thing that makes AGPL valuable as a dual-licensing tool – becomes legally unenforceable.

The football is not just being pulled away.

There is no football.

Lucy Is a Marionette

Which brings us back to MinIO specifically, and why I think the acceleration makes more sense than “they just got greedy.”

MinIO’s business model was a dual-licensing model: the AGPL community edition creates the compliance burden, and AIStor is the relief valve. That model requires AGPL to be scary enough that enterprises pay to avoid it.

If AI-generated code makes AGPL obligations unenforceable in practice, the value of that commercial license declines. The extraction window is closing. SoftBank Vision Fund 2 put money in at a $1B valuation. They need a return before the window shuts.

I want to be a little bit careful here, because what follows is a testable hypothesis, not a proven fact.

That said, closing the repo accomplishes several things at once. It stops new commits from entering AI training corpora. It prevents external eyes on code quality. There are no public commits. No community-filed source-based CVEs. No independent code security audits. No one diffing commits and noticing the code smells like AI instead of Harshavardhana anymore.

This change, I think, gives MinIO some flexibility. It can reduce headcount, rely more heavily on AI-assisted development, and shift engineering resources away from or into development internally without anyone noticing.

From that point of view? OPACITY IS THE FEATURE.

There’s some evidence to support my speculation. Glassdoor reviews from MinIO employees say: “Rapid hiring to meet financial goals has been followed by layoffs framed as role eliminations, leaving employees uncertain about their future.” They describe no performance reviews or pay adjustments for 2.5+ years. They say the company “struggles to turn [open source] into a sustainable business.” One reviewer describes leadership as “disorganized” with “frequent mixed signals.”

The headcount data is contradictory, which is itself an interesting data point. Blocks & Files reported “fewer than 45 people” at the January 2022 Series B. PitchBook currently says 195. Tracxn says 74 as of December 2024. Even taking the most generous reading, MinIO hired aggressively post-Series B and now the numbers are murkier. And MinIO claims 149% ARR growth and a spot on the 2025 Deloitte Fast 500, while employees complain about stagnant pay, layoffs, and a struggle to monetize.

The strings on the marionette lead to a cap table that seems like it’s probably running out of time, and the murky numbers make the motivations unclear. Draw your own conclusions about who – or, I suppose, “what” – might be driving the desperation.

Game Over

MinIO is not the first Lucy. Oracle killed OpenSolaris after acquiring Sun. MongoDB switched from AGPL to SSPL in 2018. Elastic moved Elasticsearch away from Apache 2.0. Redis went source-available in 2024. Different details, same Charlie Brown physics: build adoption under permissive terms, accumulate switching costs, change the deal, leave your contributors lying on their backs, staring at the sky, and wondering what they’ve done wrong to deserve this.

But this time is a little different. Every one of those prior rug-pulls happened in a world where “derivative work” still meant something enforceable. You could look at the code and trace it, and often observe how a hardware or software device or service behaved to determine if it was using your code. The license regime assumed a human wrote the code and you could follow the provenance.

That world has abruptly ended. Now AI is writing most of the code. And every VC-backed company that bet on dual-licensing is facing the same fast-closing window: build a moat around your data, because that’s the only thing AI won’t commoditize. As long as it cannot get it.

MinIO’s move seems pragmatic given the conditions.

More will follow.

THE END


Postscript A: If it weren’t for bad ideas, I’d have no ideas at all

I have more rambling thoughts. If you read this far, you might as well continue reading, but I simply ran out of time to organize or edit them.

I believe in steel-manning my own blind spots, so here they are.

The 2021 AGPL switch predates any credible AI code generation threat. That move was classic dual-licensing. My AI thesis seems to explain the 2025-2026 acceleration, not the original license change.

The barn door was already open; coming from a Barnson, that’s saying something. MinIO’s code has been in training corpora for years. Archiving the repo stops new commits from being ingested, but it does not un-train existing models. If the goal is to protect the code from AI, they are years too late.

Courts could go the other way. If they rule that training on copyrighted code constitutes copying rather than fair use, AGPL gets stronger, not weaker. The commercial license becomes more valuable, not less. The whole thesis flips.

And the simplest explanation deserves consideration: maybe they just could not monetize the community edition and stopped spending on it. The AI angle adds sophistication to what might be a straightforward P&L decision.

I think of this as a testable prediction, not an established fact.

  • If MinIO starts shipping code that looks AI-generated (good luck figuring that out if it’s closed source!), and if the security posture degrades without public scrutiny, and if the headcount continues to contract while ARR claims grow, those are confirming signals.
  • If they hire aggressively, ship great software, and the Glassdoor reviews improve, then I was probably wrong, and I’d be happy to say so.

The company is headed by a crew with proven open-source chops, and they deserve success if it can be found in today’s enterprise storage hellscape.

Postscript B: for my fellow devs

If you are running MinIO in production today? The upstream repo is archived and the README says it is no longer maintained. The Community Edition is source-only with no precompiled binaries and no admin UI. If you want the batteries-included experience, you are looking at AIStor subscription pricing. Your data is probably portable (S3 is a well-defined API), but the switching cost is real and it scales with how deep you went. Figure it out and I’d suggest you get the hell out.

If you are building something new? Six years ago this was a permissive, community-friendly project. It had hundreds of millions of Docker pulls and was the default recommendation on Stack Overflow. MinIO’s move last night makes me feel much like I did watching the animatronic corpse of Peter Cushing play Grand Moff Tarkin in Rogue One… creepy and off-putting. I skip that scene. Unless I want to admire what artists could do in an era when “Will Smith Eating Spaghetti” was a funny/weird AI thing instead of looking more photorealistic than me taking a video of myself in my kitchen.

If you are an open-source developer? The deeper problem is not really MinIO. It’s the enforcement mechanism. The ability to prove “derivative work” is eroded by AI code generation faster than anyone is building replacements. Your open-source app monetization strategy is the Disappearing Lucy’s Football. The economic infrastructure that sustains open-source development needs to be replaced.

I don’t have a fix for that. I don’t think anyone does. Maybe open-source of the future is all on Patreon or something.

I recently started creating independently again: code, music, and now blogging. It is a much healthier place for me than trying to lead a team. The irony? The same language models undermining code copyleft are the ones making it possible for me to create more, faster, and at higher quality than before. Without worrying if some old code is living rent-free in my head.

The world is weird.

And if you give money to organizations that defend software freedom: give it to the EFF.

I’ve been busy

So, yeah. Eleven years ago I basically stopped blogging. The reasons were many: three jobs, insufficient income, and I started a new gig that reminded me repeatedly and pointedly that having a public presence on the Internet if I was not paid to do so was a liability. Plus I’d gotten into alternative social media channels for a while, then abandoned them completely as I got immersed in my work and other hobbies.

But lately, I’ve found a lack of authentic voices on the internet. The Internet is such a clickbait-farming ad-supported wasteland of barely-readable text. When I try to read thoughtful articles, they are often surrounded top and bottom by ads around a meager fraction of a paragraph before I must scroll, tap, tap, tap, and scroll again to read the darn thing. Or I turn on “Reader Mode” and get the first paragraph and a subscription link so I can get more spamvertising in my inbox.

So anyway. Here’s my tiny little bump on the corner of the Internet, trying to provide valuable content that interests me. Frequency TBD. Topics TBD. But I hope you find it interesting.

Privacy in a coffee shop

So I have to post you about two things — the outcry regarding FB privacy abuses, and the state’s political response to that outcry.

https://www.wsj.com/articles/for-facebooks-employees-crisis-is-no-big-deal-1523314648

I don’t understand. You put your info and personal intimacies on FB. For years. For 10 years. Everything about yourself. For free. You do all this for free, putting your life online for 10 years. And then you complain when the internet service you’ve been using for free harvests your information? Like all of a sudden your privacy has been violated?

While privacy is at the forefront of the issue, the underlying tenet to me is the value of self-information and the value of the transfer of self-information by which that privacy is being asserted.

If you and I were in a coffee shop, and we were trying to have a private conversation, and we noticed someone listening in and eavesdropping on our conversation…we’d kick their ass!!! But seriously, the coffee shop is a place where people barter exchange with the shop for food and drink. Except there’s something else going on. There’s space to sit and work and relax. There’s wifi. But you don’t have to buy something from the shop to use its internet connection or to sit or meet with others or to transact personal and private business. The coffee shop proprietor isn’t demanding you buy something to use its other various services.

So people in coffee shops all the time assemble in these private-public spaces and yammer away about sensitive personal details with everyone in ear’s distance hearing it. And this doesn’t even cover the supremely annoying people yammering away loudly on their phones.

FB is the internet’s coffee shop. And everyone is hanging out at this place a lot, A LOT, and yammering away about their personal lives, and accepting that since they’re not buying and have never bought anything from the food counter that the shop is making money by taking all that yammering, which is being given to it for free, and turning around and selling that yammering to advertisers.

When you give your data for free in exchange for service you assign an informational value of free to yourself. Your data and your privacy are worth free to you. That is what is implicit to me. The implicit statement is: my privacy and personal data are worth nothing because I am giving them freely to a service I am using while knowing that service makes money off of advertising from me giving my data and from me not asking anything in return from the service making money off my data.

That value exchange of self-information seems to me to be the same whether you stop in the shop one time or stop in one time a minute. The rate of exchange remains the same. The transfer volume of self-information doesn’t alter the value of the self-information being zero.

So you’ve been going to this coffee shop for some time and had a general chat one day about how hard it is to get your foot in your shoes, and the very next day you show up and at the table where you’re sitting is an advert for shoehorns and other fine accessories. And this goes on for a while until related ads start showing up the minute after you mention a specific topic. At what point do you get up and leave the coffee shop and never come back? Especially when you aren’t being forced to use this shop and there are other shops which provide similar service?

A year ago it became widely known that foreign nations were scraping data from this shop and buying political ads to influence the presidential election. Last month it became widely known that companies were indeed harvesting data from this shop to service those political persuasion campaigns.

Guess what? No one is leaving the coffee shop. A free and non-coerced civic polity continues to give away their data for free.

When something is free then you are the product. And you have assigned your own self-information to be worth $0. So either leave the shop and never go back, or keep going to the shop and know what you are in for. Because it’s not called PrivacyBook.

So, again, this is what I don’t understand. People put personal intimacies on FB for years. For free. And all of a sudden their privacy has been violated?

But then…something far far FAR worse happens. The government decides it must intervene and assert authority. Overstepping its role by somehow protecting people from their own lack of self-awareness. The government is not our Mom and Dad. The American people are not teenagers. The same thing happened when various levels of government tried to block the rise of Uber and AirBnB. Not only is society using these services, but society is defending their right to exist by using them without reservation. So let them do it. If people have a problem with privacy violations, and there is no illegal activity taking place, then let the people work it out.

(Side note: it just shocks me that politicians, particularly conservative ones, would inject themselves into the fray by attacking a corporate juggernaut and cornerstone of the American economy. While the privacy issue does seem in some ways a media hype job, per the above WSJ article, I’m surprised a conservative administration and legislative leadership is letting this attack happen. But that’s today’s world when all you care about is votes and not principle.)

The bottom line is that I miss the community on Barnson.org. I understand my sentiment may not just be old-fashioned but a fossil emotion in the hyper-now digital world that is instantaneous, widespread engagement. But I don’t care. If FB went away tomorrow I wouldn’t miss 90% of the people who are my tagged ‘friends’ at that coffee shop. I miss this coffee shop. I miss the people I know and care about, and the quasi-privacy of our thoughtful, considerate conversation and debate within the back corner of the bigger shop that is the internet.

Trump revokes Washington Post’s campaign press credentials

So I have to post you. I’m no Trump supporter but I did happen to hit the WP yesterday when the headline “Donald Trump suggests President Obama was involved with Orlando shooting” was live.

http://mobile.reuters.com/article/newsOne/idUSKCN0YZ2DA

I was way shocked. I couldn’t believe that to be true. So I went to view Trump’s speech and nowhere did Trump say, at all, that Obama was involved with the Orlando shooting.

Of course I don’t condone revoking press credentials. But I do observe how for the past several months the WP has been unusually harsh and increasingly biased against Trump. The WP has gone from reporting the news to reporting their bias. My guess is the WP is doing this out of some internal crusade to protect journalism and defy those who would curtail a free press.

But that’s not the point of my posting you. The point is that I feel neither WP nor the Trump campaign realize how this continued siege of negative reporting HELPS Trump. I feel there are many people out there, the DC-dislikers, who consider the negative reporting to be coming from a source representative of a congressional institution they want to change. To these DC-dislikers, the WP is mainstream and legacy media feeding their enmity. The more negative the reports against Trump the more the DC-dislikers dig their heels into their minds and become more aligned with Trump. It’s a strange and warped psychological situation.

And basically I see two mistakes. I see the editorial mistake of the WP failing to report activity and static detail, almost allowing the aggressive virility of the late Hunter Thompson to seep into their writing. And I see the tactical mistake of the Trump campaign assessing a negative coercion power legacy media believes it still wields.

Handy Space Monitoring on ZFSSA

This is a re-post from my blog at http://blogs.oracle.com/storageops/entry/handy_space_monitoring

Semi-real-time space monitoring is pretty straightforward with ECMAScript & XMLRPC.  I’ve never really been a fan of using used + avail as a metric; it’s simply too imprecise for this kind of work.  With XMLRPC, you can gauge costs down to the byte, and with JavaScript/ECMAScript you have some easy date handling for your report.

Here’s a code snippet to monitor fluctuations in your overall pool space usage.  Just copy-paste at the CLI to run it. Let’s call this "Matt’s Handy Pool Space Delta Monitor".  This one will update every 5 seconds; just change the "sleep" interval to whatever you need to increase or decrease the update speed; press CTRL-C a few times rapidly to exit.

There must be a way to get the ECMAScript interpreter to break out of the whole loop in response to the first CTRL-C, rather than just breaking the current iteration and requiring multiple CTRL-C presses, but I’m not exactly certain how to do it:

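script
/* Running totals of used bytes for the first pool listed. */
var previousSize = 0,
  currentSize = 0,
  currentDate;
while (true) {
  currentDate = new Date();
  currentSize = nas.poolStatus(nas.listPoolNames()[0]).np_used;
  /* The first line printed is the delta from zero, i.e. total used bytes. */
  printf('%s bytes delta: %s bytes\n',
    currentDate.toUTCString(),
    currentSize - previousSize);
  previousSize = currentSize;
  /* Pause between samples; change 5 to adjust the update speed. */
  run('sleep 5');
}
.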

Here’s some sample output from a very busy system which handles some of Oracle’s ZFS bundle analysis uploads.  The system is constantly extracting, compressing, and destroying data, so it’s pretty dynamic.

aueis19nas09:> script
("." to run)> var previousSize = 0,
("." to run)>   currentSize = 0;
("." to run)> while (true) {
("." to run)>   currentDate = new Date();
("." to run)>   currentSize = nas.poolStatus(nas.listPoolNames()[0]).np_used;
("." to run)>   printf('%s bytes delta: %s bytes\n',
("." to run)>     currentDate.toUTCString(),
("." to run)>     currentSize - previousSize);
("." to run)>   previousSize = currentSize;
("." to run)>   run('sleep 5');
("." to run)> }
("." to run)> .
Wed, 08 Jul 2015 17:44:31 GMT bytes delta: 102937482702848 bytes
Wed, 08 Jul 2015 17:44:36 GMT bytes delta: 0 bytes
Wed, 08 Jul 2015 17:44:42 GMT bytes delta: 362925056 bytes
Wed, 08 Jul 2015 17:44:47 GMT bytes delta: 1039872 bytes
Wed, 08 Jul 2015 17:44:52 GMT bytes delta: 424662016 bytes
Wed, 08 Jul 2015 17:44:57 GMT bytes delta: -181739520 bytes
Wed, 08 Jul 2015 17:45:02 GMT bytes delta: 0 bytes
Wed, 08 Jul 2015 17:45:07 GMT bytes delta: -362792960 bytes
Wed, 08 Jul 2015 17:45:13 GMT bytes delta: -56487936 bytes
Wed, 08 Jul 2015 17:45:18 GMT bytes delta: 0 bytes
Wed, 08 Jul 2015 17:45:23 GMT bytes delta: 311884288 bytes
Wed, 08 Jul 2015 17:45:28 GMT bytes delta: -3111936 bytes
Wed, 08 Jul 2015 17:45:33 GMT bytes delta: 329170944 bytes
Wed, 08 Jul 2015 17:45:38 GMT bytes delta: 94827520 bytes
Wed, 08 Jul 2015 17:45:44 GMT bytes delta: -24576 bytes
Wed, 08 Jul 2015 17:45:49 GMT bytes delta: 356221440 bytes
Wed, 08 Jul 2015 17:45:54 GMT bytes delta: -36864 bytes
Wed, 08 Jul 2015 17:45:59 GMT bytes delta: 503583744 bytes
Wed, 08 Jul 2015 17:46:04 GMT bytes delta: 175494144 bytes
Wed, 08 Jul 2015 17:46:10 GMT bytes delta: -342528 bytes
Wed, 08 Jul 2015 17:46:15 GMT bytes delta: 135242240 bytes
Wed, 08 Jul 2015 17:46:20 GMT bytes delta: -39769600 bytes
Wed, 08 Jul 2015 17:46:25 GMT bytes delta: -124416 bytes
Wed, 08 Jul 2015 17:46:30 GMT bytes delta: -136044544 bytes
^CWed, 08 Jul 2015 17:46:31 GMT bytes delta: 0 bytes
^C^Cerror: script interrupted by user
aueis19nas09:>

Caveats:

  • This isn’t actually a 5-second sample; it simply sleeps 5 seconds between sample periods, and due to execution time you will probably get a little drift that will manifest as a displayed interval of 6 seconds here & there if left running a long time.
  • If you wanted to modify this to be GB instead of bytes, you’d replace "currentSize - previousSize" with something like "Math.round((currentSize - previousSize) / 1024 / 1024 / 1024)", but that will probably just end up with a string of 0 or 1 results with such a short polling interval.  You’d need to see significant and rapid data turnover to get a non-zero result if polling by gigabyte every five seconds!
  • This only monitors the first pool on your system. To monitor other pools on your system, you’d change "nas.listPoolNames()[0]" to "nas.listPoolNames()[1]", or whichever index your pool occupies in the list returned by "nas.listPoolNames()".
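
If the first two caveats bother you, here is a minimal sketch of one way around them in plain ECMAScript. The helper names bytesToGiB and bytesPerSecond are made up for this illustration and are not appliance functions. The idea is to keep a few decimals instead of rounding to whole gigabytes, and to divide by the real elapsed time between samples instead of assuming exactly five seconds.

```javascript
// Hypothetical helpers (names are not from the appliance or the original script).
var BYTES_PER_GIB = 1024 * 1024 * 1024;

// Convert a byte delta to GiB, keeping three decimals so that
// small deltas do not all collapse to 0 or 1.
function bytesToGiB(bytes) {
  return (bytes / BYTES_PER_GIB).toFixed(3);
}

// Normalize a delta by the real elapsed time between two samples.
// Each sample is assumed to look like { time: <ms since epoch>, used: <bytes> }.
function bytesPerSecond(previous, current) {
  var elapsedSeconds = (current.time - previous.time) / 1000;
  if (elapsedSeconds <= 0) {
    return 0; // identical or out-of-order timestamps: no meaningful rate
  }
  return Math.round((current.used - previous.used) / elapsedSeconds);
}
```

In the monitor loop you would presumably build each sample from currentDate.getTime() and currentSize, then print the helper results in place of the raw delta. I have only checked these helpers as ordinary ECMAScript, not on an appliance.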

Enjoy!

Stuff Blog: Day 1

So I decided to create a “Stuff Blog” to document my adventure trying to sell down all the stuff in my life. Most of it I don’t need, and I want to get rid of as much as is practical.

Day 1 listings: My Garmin VivoSmart smart watch and my ChromeCast. Yeah, I know they are both small personal electronics; I’m going to try something larger tomorrow. Like maybe an old bed or an old desk or something.

Understanding the Oracle Backup Ecosystem

Mirrored at https://blogs.oracle.com/storageops/entry/understanding_the_oracle_backup_ecosystem

Table of Contents

  • Understanding the Oracle Backup Ecosystem
  • Backup/Restore Drivers
      • The “Oops”
      • Defending against and pursuing lawsuits
      • Taxes & Audits
      • Disaster Recovery
      • Reduce Downtime
      • Improve Productivity
  • The Backup/Restore Tiers
      • Tier 1 Backups
      • Tier 2 Backups
      • Tier 3 Backups
      • Tier 4 Backups
  • The Tools
      • ZDLRA
      • SMU
      • OSB
      • ACSLS
      • STA
      • Oracle ZFS Storage
  • Tools For Tiers

Understanding the Oracle Backup Ecosystem

A frequent question I hear these days is something along the lines of “How is Oracle IT leveraging the Zero Data Loss Recovery Appliance, Oracle Secure Backup, and ZFS together?”

Disclaimer 1: The opinions in this blog are my own, and do not necessarily represent the position of Oracle or its affiliates.

Disclaimer 2: In Oracle IT, we “eat our own dog food”. That is, we try to use the latest and greatest releases of our product in production or semi-production environments, and the implementation pain makes us pretty strong advocates for improvements and bug fixes. So what I talk about here is only what we’re doing right now; it’s not what we were doing a year ago, and probably won’t be what we’re doing a year from now. Some of today’s innovative solutions are tomorrow’s deprecated processes. Take it all with a grain of salt!

Disclaimer 3: I’m going to talk about some of my real-world, actual experiences here in Oracle IT over the past decade that influenced my position on backups. Don’t take these experiences as an indictment of our Information Technology groups. Accidents happen; some are preventable, some not. The real key to success is not in not failing, but in moving forward and learning from the experience so we don’t repeat it.

Backup/Restore Drivers

Typically, the need for offline backup & restore is driven by a few specific types of needs.

The “Oops”

Humans are fallible. We make mistakes. The single most common reason for unplanned restores in Oracle IT is human error. This is also true for other large enterprises: Google enjoyed a high-profile incident of corrupted mailboxes several years ago due to a flawed code update. Storing data in the “cloud” is not a protection against human error. The only real protection you have from this kind of incident is some kind of backup that is protected by virtue of being either read-only or offline.

Defending against and pursuing lawsuits

In today’s litigious environment, being able to take “legal hold” offline, non-modifiable, long-retention backups of critical technology is a prerequisite to efficiently defending you and your company from various legal attacks. Trying to back up or restore an environment that has zero backup infrastructure in place is a huge hassle, and can endanger your ability to win a lawsuit. You want to have a mechanism in place to deal with the claims of your attackers – or to support the needs of your Legal team in pursuing infringements – without disrupting your normal operations.

Taxes & Audits

Tax laws in various countries usually require some mandatory minimum of data retention to satisfy potential audit requirements. If you can’t cough up the data required to pass an audit – regardless of the reason, even if it’s a really good one! – you’re probably facing a stiff fine at a minimum.

Disaster Recovery

I’m going to be real here. This is my blog, not some sanitized, glowing sales brochure. Everybody is – or should be! – familiar with what “Disaster Recovery” is. Various natural and man-made disasters have happened in recent decades, and many companies went out of business as a result of inadequate disaster recovery plans. While the chance of a bomb, earthquake, or flood striking your data center is probably very low, it does exist. Here’s a short list of minor disasters I’ve personally observed during my career. There have been many more; I’ll only speak of relatively recent ones.

  • A minor earthquake had an epicenter just two miles from one of our data centers. I was in the data center in question at the time; it felt as if a truck struck the building. Several racks of equipment didn’t have adequate earthquake protection and shifted; they could easily have fallen over and been destroyed.

  • An uninterruptible power supply’s automated transfer switch exploded, resulting in smoke throughout the data center and a small fire that could have spread and destroyed data.

  • Another data center had a failure in the fire prevention system, resulting in sprinklers dousing several racks worth of equipment.

  • Busy staff and a flawed spreadsheet resulted in the wrong rack of equipment being forklifted and shipped to another data center.

  • A data center was in the midst of a major equipment move with very narrow outage windows. During one such time-critical move, facilities staff incorrectly severed the ZFS Appliance “Clustron” cables with a box knife before shipping the unit. I powered the unit up without detecting the break, resulting in a split-brain situation on our appliance that corrupted data. Mea culpa! Seriously, don’t do that. I don’t think the ZFSSA is vulnerable to this anymore as a result of this incident, but it was painful at the time and I don’t want anyone to go through that again…

  • Multiple storage admins on my team have accidentally destroyed the wrong share or snapshot on a storage appliance. When you have hundreds of thousands of similarly-named projects, shares, and snapshots, it’s nearly inevitable, even if the “nodestroy” bit is set: if the service request says to destroy a share, and all the leadership signed off on the change request for destroying it, you destroy it despite the “nodestroy” thing. But it’s quite rare.

  • Admins allowed too many disks to be evicted from the disk pool on an Exadata because ((reasons, won’t go into it)), resulting in widespread data loss and a data restore.

This was the minor stuff. Imagine if it were major! If you don’t have solid, tested disaster recovery plans that include some kind of offline or near-line backup, you’re exposed and are likely to go out of business even if you suffer nothing worse than a user-induced disaster such as the “Oops” category above.

Reduce Downtime

Having a good backup means that you have less downtime for your staff in case of any challenge with your data. Knowing how long it takes to restore your data is a benefit of a regularly-scheduled restore test.

Improve Productivity

Finally, if you don’t have a good backup, the chance is high that you’ll eventually end up having to do some work over again due to lack of good back-out options. This loss of productivity hurts the bottom line.

The Backup/Restore Tiers

In any large enterprise environment, there exist multiple tiers of needs for backup/restore. It’s often helpful to view backup and restore as a single type of tier: if your backup needs tend to be time-sensitive, your restore needs are probably even more so. Therefore, in the interest of simplicity I’ll assume your tier need for restores mirrors your tier for backups.

Here’s how I view these tiers today. They aren’t strictly linear as below – there is a lot of cross-over – but they align nicely with the technologies used to back them up.

  1. Mission-critical, high-visibility, high-impact, unique database content.
  2. Mission-critical, high-visibility, high-impact, unique general purpose content.
  3. Lower-criticality unique database and general purpose content.
  4. Non-unique database and general purpose content.
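
As a rough sketch only, the four tiers above can be read as a small decision rule. The function name and the field names (unique, missionCritical, database) are assumptions invented for this illustration; real classification has plenty of cross-over, so treat this as a first approximation and not a policy.

```javascript
// Hypothetical classifier for the four tiers listed above.
// content is assumed to look like:
//   { unique: true, missionCritical: true, database: false }
function backupTier(content) {
  if (!content.unique) {
    return 4; // non-unique content of any kind
  }
  if (!content.missionCritical) {
    return 3; // lower-criticality unique content
  }
  // Mission-critical and unique: database content is Tier 1,
  // general purpose content is Tier 2.
  return content.database ? 1 : 2;
}
```

For example, a unique, mission-critical file share would come back as Tier 2 under this rule, and anything reproducible would come back as Tier 4 regardless of the other fields.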

Tier 1 Backups

For Tier 1 Oracle database backup and restore, there exists one best choice today: The Zero Data Loss Recovery Appliance, or "ZDLRA". While you can perform backups to ZFS or OSB tape directly – which works quite well, and we’ve done it for years in various environments – the ZDLRA has some important advantages I’ll cover below.

That said, the Oracle ZFS Storage Appliance in combination with Oracle Secure Backup can also provide Tier 1-level backups, though the “forever-incremental” strategy available on ZDLRA is simply not an option there. For Tier 1 non-ZDLRA backups, we resort to more typical strategies: rman backup backupset using a disk-to-disk-to-tape approach, NFS targets, direct-to-tape options, etc.

For Tier 1, you also want multiple options if possible: layer upon layer of protection.

Tier 2 Backups

For Tier 2 general-purpose content, the ZDLRA just isn’t particularly relevant because it doesn’t deal with non-Oracle-Database data. By calling it “Tier 2” I’m not implying it’s less important than Tier 1 backups, just that you have a lot more flexibility with your backup and recovery strategies. Tier 2 also applies to your Oracle database environments that do not merit the expense of ZDLRA; ZFS and tape tend to be considerably cheaper, but with a corresponding rise in recovery time and management effort.

In Tier 2, you’ll have the same kind of backup & restore windows as Tier 1, but will use non-ZDLRA tools to take care of the data: direct-to-tape backups, staging to OSB disk targets for later commitment to tape, etc. Like Tier 1, you want to layer your recovery options. Our typical layers are:

  1. Sound change management process to eliminate the most common category of “Oops” restores.
  2. Snapshots. Usually a week or more, but a minimum of 4 daily automated snapshots to create a 3-day snap recovery window.
  3. Replication to DR sites. For Oracle Database, this usually means “Data Guard”. For non-DB data, ZFS Remote Replication is commonly used and has proven exceptionally reliable, if occasionally a little tricky to set up for extremely large (100+TB) shares.
  4. For Oracle databases, an every-15-minutes archive log backup to tape that is sent off site regularly at the primary and DR site(s).
  5. Weekly incremental backups to tape at both the primary & DR site(s), using whatever hot backup technology is available to us on the platform so that a backup is “clean” and can be restored without corrupted in-flight data.
  6. Monthly full backups to tape at both the primary & DR site(s).
  7. Ad-hoc backups to tape as required.
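
The snapshot arithmetic in layer 2 is worth spelling out: with evenly spaced snapshots, the oldest retained snapshot is one interval younger than the count suggests, which is why 4 daily snapshots give a 3-day window. A minimal sketch follows; the function name is hypothetical and it assumes snapshots are taken at a fixed interval.

```javascript
// Hypothetical helper: how far back the oldest retained snapshot reaches,
// measured at the moment the newest snapshot is taken.
function snapshotWindowDays(snapshotCount, intervalDays) {
  if (snapshotCount < 1) {
    return 0; // no snapshots, no recovery window
  }
  return (snapshotCount - 1) * intervalDays;
}
```

By the same arithmetic, keeping 8 daily snapshots is what it takes to cover a full week.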

Tier 3 Backups

Leveraging the same toolset as Tier 2 backups, Tier 3 backups are simply environments that need less-frequent backups of any sort. It’s the kind of stuff where, if you lost access for 12-24 hours, your enterprise could keep running but a bunch of users would be inconvenienced. It’s not stuff that endangers your bottom line – if it’s a revenue-producing service, it must be treated as Tier 1 or Tier 2, or else you might end up owing your customers some money back! – but it would be painful/irritating/time-consuming to reproduce.

In Oracle IT, this tier of data receives second-class treatment. It gets backed up once per week instead of constantly. Restore windows range from a few hours to a couple of days. Retention policies are narrower. Typically, very static environments like those held for Legal Hold or rarely-read data are stored in this tier. The data is important enough to back up, but the restoration window is much more fluid and the demands infrequent.

ZFS Snapshots are critical for this kind of environment, and typically will be held for a much longer period than the few days one might see in a production environment. Because the data is much more static, the growth of snapshots relative to their filesystems is very low.

Tier 4 Backups

The key phrase for backups in this tier is “non-unique”. In other words, the data could easily be reproduced with roughly the same amount of effort it would take to restore from tape. In general, these Tier 4 systems don’t receive much if any backup at all. ZFS snapshots occur on user-modifiable filesystems so that we can recover within a few days from a user “oops” incident, but if we were to lose the entire pool it could be reconstructed within a couple of days. Although it’s important to have some mechanism for tape backup should one be required, such backups will be the exception and not the rule.

The Tools

Now to the fun part. How do we glue these things together in various tiers? What tools do we use?

ZDLRA

  1. The forever-incremental approach to backups means that there is less CPU and I/O load on your database instance. Backup windows typically generate the heaviest load on your appliance, and since the ZDLRA should never require full backups after the first one, it’s an outstanding choice for backing up I/O-challenged environments.
  2. The ZDLRA easily services a thousands-of-SIDs environment without backup collisions. This is really critical for Cloud-style environments with many small databases, where traditional rman scheduling tends to fall apart pretty easily due to schedule conflicts over limited tape resources.
  3. Autonomous tape archival helps aggregate backups and provide on-demand in-scope Legal Hold, Disaster Recovery, Environment Retirement, and Tax/Audit backups to tape. Many may think “tape is dead”… but they think wrong!

SMU

Oracle’s SMU – “Snap Management Utility” – is a great way to back up Tier 2 Oracle databases to ZFS. It handles putting your database into hot standby mode so that you can take an ACID-compliant snapshot of the data and set up restore points along the way. If you can’t afford ZDLRA, SMU + ZFS is a great first step. Just don’t forget to take it to tape too!

OSB

OSB version 12 provides “Disk Targets”. This, in essence, gives users of OSB 12 a pseudo-VTL capability. This new Disk Target functionality provides some other unique benefits:

  1. Aggregate multiple rman backups of smaller-than-a-single-tape size onto a single tape.
  2. With sufficient streams to disk, you can be rid of rman scheduling challenges that often vex thousands-of-SIDs environments when backing up to tape.
  3. By aggregating rman and other data to a single archive tape, you increase the density of data on tape, avoid buffer underruns, and maximize the free time for your tape drive. What often happens with a slow rman backup is that the tape ramps its speed down to match the input stream, doubling or even quadrupling the time the tape drive is busy. By buffering the backups to disk first, you can ensure the tape drive is driven at maximum speed once you’re ready to use “obtool cpinstance” to copy those instances to tape.
  4. Ability to use any kind of common spindle or SSD storage as a disk target. We use a combination of local disks on Sun/Oracle X5-2L servers running Solaris as well as ZFS Storage Appliance targets over 10 Gbit Ethernet.
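
To put rough numbers on the tape-speed point in item 3, here is a small sketch. Every figure in it is hypothetical and chosen only to make the arithmetic easy; it assumes the drive simply matches the slower of its own top speed and the incoming stream, and it ignores the time spent staging to disk, since that does not occupy the tape drive.

```javascript
// Hypothetical model: the drive runs at the slower of the input stream
// and its own maximum rate, and stays busy until all data is written.
function driveBusySeconds(dataMB, inputMBps, driveMaxMBps) {
  var effectiveRate = Math.min(inputMBps, driveMaxMBps);
  return dataMB / effectiveRate;
}

// Made-up example: 240,000 MB of backups and a drive that tops out at 240 MB/s.
var directToTape = driveBusySeconds(240000, 60, 240);   // slow rman stream straight to tape
var stagedOnDisk = driveBusySeconds(240000, 240, 240);  // copied from a disk target at full speed
```

With these invented figures the drive is tied up for 4000 seconds when fed directly and 1000 seconds when fed from a disk target, which is the "quadrupling" case.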

ACSLS

Oracle’s StorageTek Automated Cartridge System Library Software – ACSLS for short – provides a profoundly useful capability: virtualization of our tape silos. We can present a single silo from our smaller SL3000 libraries to the Big Boy SL8500 library as a virtual tape silo to a given instance of OSB. This allows truly isolated multi-tenancy and reporting for individual customers or lines of business. This capability is leveraged to the max across all of our Enterprise, Cloud, and Managed Cloud environments.

STA

Oracle’s StorageTek Tape Analytics (STA) provides predictive failure analysis of tapes and silo components. All storage – tape, SSD, and magnetic spindle – will fail eventually. STA provides valuable insight into the rate of this decay, and works in tandem with ACSLS to proactively, predictively fail media out of the library when it’s no longer reliable.

Oracle ZFS Storage

Oracle’s ZFS Storage Appliance provides a uniquely flexible, configurable storage platform to leverage as a disk backup target, rman “backup backupset” staging area for massive-throughput Oracle database backups, remote replication source or target, and more. The proven self-healing capabilities of Oracle’s ZFS storage – particularly effective in a once-in, many-out backup situation – help guarantee that backups are healthy and exactly what you intended to commit to tape. In many ways, the ZFS Storage Appliance is the fulcrum around which all our other utilities rotate, and its seamless integration as a disk target for OSB over either NFS or NDMP is simple, straightforward, and provides unparalleled analytic ability.

Tools For Tiers

If you’ve read this far, you probably already have a pretty good idea of what to use for which tier. ACSLS, STA, ZFS, and OSB all factor into every tier of backups in one way or another. By tier:

  1. ZDLRA with a sub-15-minute recovery point objective.
  2. ZFS Snapshots plus hot backups to tape and/or OSB Disk Targets, with a 15-minute recovery point objective. SMU may be appropriate for some specific environments.
  3. ZFS Snapshots are the primary “backup”, with a far more generous 24-hour recovery point objective using OSB disk and tape targets.
  4. ZFS Snapshots as the primary or only “backup”; no specific recovery point objective as the environment could be reconstructed if necessary.

I hope this is helpful for you when figuring out how to back up your Red Stack. All the best!

“The Flaw”

Just watched “The Flaw”. It’s an entertaining and surprisingly unbiased documentary covering the myriad causes of the 2008 financial disaster from which the world is still recovering.

The most startling realization of the film for me is that from 1977 to 2007 the American people collectively engaged in the largest redistribution of wealth in world history, transferring money from the poorest 65% to the top 1%, from people who would spend the money to those who tend to invest the money rather than spend it. And we did all of this VOLUNTARILY through debt.

The second most startling realization is that we are still doing this. And it’s accelerating. The poorest among us are once again making the richest richer, and the richest are once again investing in more debt-based money-generating vehicles based on asset bubbles rather than investing in things that have worth due to their utility. All because, ultimately, exploitative debt-based real estate securities generate far more short-term profits than investing in factories and technologies that make real, tangible stuff.

Enjoy the respite from the housing bubble, folks. It’s still ongoing, and we’re still pumping twenty billion dollars a month into trying to keep the illusion of wealth growth through home appreciation for the middle-class rather than real, tangible wage increases and innovation with production.

My thoughts on the Apple Watch keynote

Watched the keynote today. Am I going to get an iWatch? No. Here’s why:

  1. 18-hour “typical day” battery life. Ouch. I expect a watch to last at least a full day on a charge, and less if I’m tracking a fitness activity with it (but I still expect 10+ hours during fitness activities). From early reports, under heavy use this “18 hour” battery life is really about two hours; there’s a reason the very first accessory available for the watch is an expansion battery.
  2. Patents have pretty well locked up the optical heart rate market, so unless Apple licensed one of the two major patent-holders, the optical heart rate is going to be terribly inaccurate under heavy motion, high heart rates, sweat, and for those with dark skin.
  3. No waterproofing. Just splash-resistance. This is the deal-breaker for me. My fitness watch needs to be able to go into the pool, reservoir, or ocean and be 100% fine in an unexpected downpour when I’m on the bike or the run.
  4. Total dependence on an iPhone. I want my wearable to track movement, distance, and activities even if I choose to leave the phone at home while hitting the weights, pool, bike, or track.

You won’t notice “price” on my list. Like most Apple products, when you evaluate the capabilities, weight, and feature set at day of release, Apple products are actually very competitive. At $349, I think it’s going to sell like gangbusters, with a compelling feature set that eclipses much of the similarly-priced competition.

And I hope they sell a gazillion of them so they can eventually address the needs of multisport athletes.

Maybe in version 2.0. Or 3.0…

2015 Mock Sprint Tri Results

I had some issues with my Garmin 910xt, but eventually I fixed the mock tri file. Woot! Next time, I’ll disable all auto lap functionality before starting the tri, because apparently that’s what interferes with the run data & corrupts the file.

Total moving time (not stopped @ stoplights): 112 minutes (1 hr, 52 minutes). Or more or less totally in line with most average beginner times, with a slightly better bike and a considerably worse run. Not at all unexpected.

  • Mock Swim: 7:29. https://connect.garmin.com/modern/activity/715055281 T1: 7:26. https://connect.garmin.com/modern/activity/715055283 . I will do way better than this if I’m not DRIVING from the pool to my house for T1.
  • Mock Bike: 47:49 https://connect.garmin.com/modern/activity/715055284 T2: 2:05 https://connect.garmin.com/modern/activity/715055285
  • Mock Run: 47:07 https://connect.garmin.com/modern/activity/715055286 (This is the totally broken part)

Glad to have the data & compare it to my first super-sprint from last year:

  • RCStake Swim leg: I’m twice as fast (it was 300m 6x50m, not 700m): https://connect.garmin.com/modern/activity/560790985
  • RCStake Bike leg: 2MPH faster: https://connect.garmin.com/modern/activity/560790991
  • RCStake run leg: OK, I was a little slower today than on the run leg last year. But the mock tri is nearly twice the length. https://connect.garmin.com/modern/activity/560790995 .

Observations:

  • My 910xt is finally recognizing my swim strokes as freestyle instead of backstroke! This means my form work is starting to pay off. And those laps I did do backstroke are almost twice as slow as freestyle, which clearly tells me I need to avoid backstroking if at all possible; a slow freestyle is faster than my fastest backstroke!
  • I blew up my legs on the uphill bike leg and didn’t work nearly hard enough on the back half of the ride while mostly cruising downhill. My calves cramped up on the first part of the run, probably from under-use on the second half of the bike ride.
  • I need to learn to aero, or spend more time in the drops. I spent maybe 25% of my time (or less) in aero on my road bike. Sure, they are just little shorty aero bars, but nonetheless it was windy and I think it would have helped.
  • Hydration & electrolytes were OK, but I think I’d do better with some timed nutrition: a little EFS electrolyte drink before the swim, a little on the bike, and my energy levels should stay a little more consistent on the run. More mental than physical, I think.
  • Transitions were rough. Going to optimize them a bit for my first sprint in two weeks.
  • Too much hotfoot & walking on the run. I should use my metatarsal pads on the bike ride and probably Vibrams instead of my clunky running shoes on the run. My turnover will be quicker, and for such a short duration on the run it should help avoid the hotfoot I often get on longer runs well over an hour.

Excited. Clearly I *can* finish the sprint tri in a reasonable amount of time, and I’m pretty certain there will be at least a few non-DNF people behind me at the end. Which is really all I can ask 🙂 — Matthew P. Barnson http://barnson.org/