How to destroy your OS with tar (vorakl.com)
98 points by vorakl on May 23, 2024 | 61 comments


Interesting read. Thanks for sharing. Maybe it is from lack of experience, but I always treat tarballs like a loaded gun. I extract them inside an empty subdirectory in my home first just to be sure and then move the data as required. It is no fun having to clean up the mess left by an incorrect extraction.
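Something along these lines, for example (the directory and archive names are just placeholders):

  mkdir ~/untar-scratch                     # empty scratch directory
  tar -C ~/untar-scratch -xvf archive.tar   # extract only inside it
  ls -la ~/untar-scratch                    # inspect before moving anything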


This is a good and safe practice. And I think most people do it this way after cleaning up the mess of a badly built tarball at least once :)


I'd argue this is a design bug. Extracting into the current directory should either not be possible or be the exception (tar xvfz --current-directory). And I'm not singling out tar here: unzip, pkunzip, etc. all have this issue, and they have all caused people data loss and worse because of this unsafe default behavior.


The fact that any program will overwrite files by default is also terrible.


Unless there is only one folder in the archive and it's not overwriting anything, in which case it should be extracted into the current directory so you don't get nested dupes.


Lots of things look like a design bug today but were a prudent choice for the environment at the time.

Like when I have an electrician at my house. He says the old way was dumb... but that was fully up to code in 1954.

Time makes fools of us all.


That might be true, but I've been using arc, tar, and pkzip since the 80s, and even then I lost work and had to clean up floppy disks because of this issue. I suppose the prudent thing is to list the files before decompressing.


Sure (and me as well), but a design bug? Maybe just different expectations of the user.

On that point, some heavy machinery took years before adding safety guards (some even required legislation before improving safety)


Even `rm -rf *` warns you before just doing it.


I have 30 years of professional experience, and with one-off tarballs that I'm not deeply familiar with, this is usually what I do as well (certainly with a tarball that has a /usr-like structure inside of it). You're good.


Most distros have the handy package "atool", containing the command "aunpack", which does the least surprising thing for all archive types (without creating a duplicate root directory when one already exists).
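For example, roughly (the archive name is just a placeholder):

  aunpack archive.tar.gz   # unpacks into a new directory unless the archive already has a single top-level one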


I believe the unar tool creates a containing directory by default.


Extracting archives directly into your system root as a superuser is in the same class of activity as piping curl output into your shell interpreter as a superuser: things that no one should ever do.


> I extract then inside an empty subdirectory in my home first

... AFTER a "tvf[jzx]", I hope


What's the gain of -t if the extraction target is disposable?

Does -x have some side effect that -t would list for you?


It's nice to check if you're about to extract a 1MB tarball into 2TBs of data before actually running out of disk space.
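For a rough check using GNU tar's verbose listing, something like this works (field positions can differ between tar implementations, so treat it as a sketch):

  tar -tvf archive.tar | awk '{sum += $3} END {print sum " bytes"}'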

Most tar programs do prevent extracting tarballs containing absolute paths (like '/etc/passwd') and relative paths (like '../../../etc/passwd'), but older tar programs still allow that. And programs written in Go, because of course: https://github.com/golang/go/issues/55356

Overall, if your HDD size is infinite and you're using GNU tar or another recent tar, I think you can skip 't' before doing a '-C' extraction into some safe directory.


How do you create a tar archive that "contains absolute paths"?


The tar file format doesn't prevent you from specifying absolute paths in the archive. It's up to the tool extracting the archive to reject/ignore such paths.


I asked about options for GNU tar because there is a bit of strange behavior.

To add absolute paths to an archive, there is "-P" option, and man says it works only for creating archives: "Don't strip leading slashes from filenames when creating archives".

To extract absolute paths from the archive, you need to add the "-C /" option, and although the tool says "tar: Strip leading `/' from member names", it will still extract it in the right place because the paths become relative and -C puts them in the root.

However, if you add "-P" during the extraction (which is not mentioned in man), the "strip leading slashes" information disappears.

So if this message bothers someone, "tar -C / -xPf file.tar" will cleanly extract absolute paths from the archive ;)


The first field in the tar header is

    char name[100];
(See https://man.archlinux.org/man/tar.5.en )

So anything that writes an absolute path there will do, including literally opening the file in a text editor and replacing the path by hand, because that whole header is just fixed-length ASCII with null-terminated strings.

(I mean, I assume the tar(1) command can do it too, but you don't need that; the format is dead simple, if weird.)


It's a good exercise to open one of these files in hexdump or something to get a feel for what's really going on inside... but yeah, GNU tar has -P / --absolute-names to just create them with leading slashes.
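For instance, something like this (the file choice is arbitrary):

  tar -P -cf abs.tar /etc/hostname   # -P keeps the leading slash in the member name
  tar -tf abs.tar                    # lists: /etc/hostname
  hexdump -C abs.tar | head          # the name[100] field sits right at offset 0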


Yeah, then I just hope I get the `mv` or `cp` right when I'm done, and don't end up with a directory full of files from the top part of the tar...


Any time you see a command starting with “sudo” and with a path of “/” in the arguments, alarm bells should be going off. Here be dragons.


Right, I would be more surprised if this didn't break your system. I never would've tried unpacking a tarball to my FS root in the first place.


Yup... tar alone can't destroy your system, you need sudo (and the appropriate parameters) for that.



What, you mean the increasingly popular install pattern

   curl ... | sudo bash
might have pitfalls? Gasp.


Well... Yes you're not wrong. But this isn't really any different from downloading an installer and entering your password to permit the install to proceed.

So I assume that you also never do that - and the downloaded installer is even worse, because you can't easily look inside the executable to determine what it does.

Shocking news: installing software installs the software.


You shouldn't do that. At least download the installer first, have a short look at it, and decide if you'd like to run it.

I personally avoid software whose manual suggests this type of install, because it makes me think the developers aren't that bright.


The content of the file being served can easily be switched based on the user agent accessing it (curl vs a browser).

Instead, you should wget it, review it, and execute it locally, or something similar.
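E.g., roughly (the URL is just a placeholder):

  wget -O install.sh https://example.com/install.sh
  less install.sh     # actually read what it is going to do
  sh ./install.sh     # run it only if you are happy with it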


Yeah, no, you shouldn't do that. Kinda my point in writing that, right? But it's terrifyingly commonplace and needs to stop.


I don't even understand the reasoning to do that.

Maintaining that kind of monstrosity of a shell script that has to work on all kinds of OSes and Linux distros must be a giant PITA compared to the simplicity of making an AppImage/Flatpak and a set of deb/rpm packages for the most popular distros, plus clear instructions for port maintainers to do the same for Arch Linux and the BSD systems.


One should never pipe curl output into `sudo bash`, but I think you've got things quite backwards here. Barring some extreme edge cases, putting binaries in /usr/bin, icons in /usr/share/icons, config files in /etc, libraries in /usr/lib and so on is standard across the Linux world, and simple utilities that are just archives of binaries, docs, and auxiliary files can easily be deployed to most distros with a common install script.

This is far, far less complex than having to maintain and distribute an entire bundled runtime environment, deal with inconsistent behavior with the local config, etc. Flatpak and AppImage have their use cases, but by no means are they simpler than just putting binaries in the right places -- they are in fact an entire additional layer of complexity.
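A common install script for that kind of utility can be a minimal sketch like this (names and paths are illustrative, and PREFIX would typically be /usr/local for a manual install):

  #!/bin/sh
  set -e
  PREFIX="${PREFIX:-/usr/local}"
  install -Dm755 mytool    "$PREFIX/bin/mytool"               # the binary
  install -Dm644 mytool.1  "$PREFIX/share/man/man1/mytool.1"  # the man page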


Here's one I spotted a few weeks back, just so you know I'm not making shit up:

https://docs.waydro.id/usage/install-on-desktops (see Ubuntu section)


If it is not provided by the distribution, those things shouldn't go in /usr/bin|lib|share but in /usr/local/bin|lib|share


That's a reasonable practice, yes, if you're doing a manual direct install. In situations like this, though, I typically just write a quick PKGBUILD script so the package manager can manage it, which means pointing things at the direct /usr/* paths, not /usr/local/*.
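As a rough illustration, a stripped-down PKGBUILD for a prebuilt binary might look something like this (names, URL, and checksum are placeholders):

  pkgname=mytool-bin
  pkgver=1.0.0
  pkgrel=1
  arch=('x86_64')
  source=("https://example.com/mytool-$pkgver-linux-x86_64.tar.gz")
  sha256sums=('SKIP')

  package() {
    # point things at the direct /usr/* paths; the package manager owns them
    install -Dm755 "$srcdir/mytool" "$pkgdir/usr/bin/mytool"
  }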


I don't recall ever seeing this approach with sudo. Is it really becoming popular? It used to be just bash, e.g. `curl ... | bash`. But now there is nothing stopping you from putting sudo in that bash script and expecting with a high probability that NOPASSWD: is also there ;)


Indeed, if you preface the curl | bash with sudo apt install, there's no difference!


Yet we all npm install or run random binaries without a second thought.


I do neither of those things!


Yes, but someone has to do the dirty work ;)


Appreciate the sacrifice


It’s funny how tar came out first. Then cpio came out, which was a lot better, but tar had the momentum and never lost it. I still find cpio more controllable and use it by preference.

But cpio existed when I first used Unix in 1982/3


and now there's pax (since POSIX looked at both of them and shuddered :-)


My WOW with cpio is pretty elementary. I've never found anything in all my usage to shudder about. Possibly there are smarty-pants usages in there that could make you shudder, I don't know. But why choose smarty pants if elementary is perfectly satisfactory?


I never understood why so few people use `pax` instead of `tar`. It has always seemed more user friendly to begin with.

With pax, the `-k` option will not overwrite existing files. Also, not using `-p o` or `-p e` means rights are not preserved. I don't know why you would preserve rights when extracting binaries into a chroot; usually you would want to control the permissions yourself instead.
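E.g., something like (the archive name is a placeholder):

  pax -r -k -f binaries.tar    # extract, never overwriting existing files
  pax -r -pe -f binaries.tar   # extract and preserve everything, including ownership (needs privileges)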


The main reason is probably that tar is installed by default on 99% of Linux systems you might come across, while pax isn't...


Now that is funny: a few days ago I was downvoted to oblivion for stating the obvious, that nobody cares about POSIX anymore. And now you are telling me that all those downvoters are probably not running POSIX-conforming systems[1] either, on their own desktops or their servers.

[1] pax is the recommended archive utility in the POSIX shell & utilities section, while tar and cpio aren't mentioned.


Because tar is super versatile, old, and as the name implies, creates an archive. Useful for creating an archive to image something, and many times you want reproducible ownership and permissions.

Tar can do all the same things; it just requires options on either the archiving or the extraction end.


This reminds me of once using gparted on a disk.

I got in and out of it a few times to make sure of what I was going to do, then I launched it again (but I didn't add any arguments).

Not only did it not complain, it defaulted to my root disk, and I ended up destroying my partition table.

Luckily the system was still running and I was able to back up everything before I shut it down.


Never untar anything to /. In this case, an inexpensive but very valuable lesson.


Bubblewrap will allow you to create a chroot for xbps. No root needed. If $HOME/void_chroot is a void rootfs:

https://docs.voidlinux.org/installation/guides/chroot.html#t...

       bwrap --bind $HOME/void_chroot/  /  --ro-bind /etc/resolv.conf /etc/resolv.conf  --rw-bind /home/ /home/  --proc /proc --dev /dev /bin/sh
Instead of xchroot, use bwrap.


I'm kind of annoyed that tar doesn't back up directories starting with a period unless you throw in a command flag.

I wish I had known this before taking a backup and reformatting.

Always test your backups, I guess...


This is an interesting one. What was the version of tar? Was it GNU tar? I wonder, how did you create that archive?

I'm asking this because I was trying to reproduce the same situation and everything seems to be working fine:

  $ tar -C /tmp/root3 -cvf test.tar .
  ./
  ./.config/
  ./.config/test
  ./.local/
  ./var/
  ./var/db/
  ./var/db/xbps/
or even like this

  $ tar -cvf test2.tar root3/
  root3/
  root3/.config/
  root3/.config/test
  root3/.local/
  root3/var/
  root3/var/db/
  root3/var/db/xbps/
No special options were needed. But if I do it this way, then all the dot directories are definitely missing:

  $ tar -cvf test2.tar root3/*
  root3/usr/
  root3/usr/bin/
  root3/usr/bin/xbps-uunshare
But this is not tar's problem. It's only because the "*" glob doesn't match dot files:

  $ echo root3/*
  root3/usr root3/var
For that, you need to explicitly add a dot:

  $ echo root3/.*
  root3/.config root3/.local
Thus, the solution might be

  $ tar -cvf test2.tar root3/.* root3/*
  root3/.config/
  root3/.config/test
  root3/.local/
  root3/usr/
  root3/usr/bin/
But I'd rather stick to the "-C dir/" option instead of relying on the "*" glob in this case.


   tar -cvf mytar.tar /home/myuser/* 
Was definitely the syntax I ran. I'm on FreeBSD, so just tar.


Then this explains why. BTW, this is expected shell behavior, specified in the POSIX standard:

  If a filename begins with a <period> ( '.' ), the <period> shall be explicitly
  matched by using a <period> as the first character of the pattern or immediately
  following a <slash> character. The leading <period> shall not be matched by:
   * The <asterisk> or <question-mark> special characters
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...

An interesting fact about FreeBSD tar's origins:

  GNU tar was included as the standard system tar in
  FreeBSD beginning with FreeBSD 1.0.
  This is a complete re-implementation based
  on the libarchive(3) library. It was first released with
  FreeBSD 5.4 in May, 2005. 
https://man.freebsd.org/cgi/man.cgi?query=tar

My personal journey with FreeBSD began with version 5.3 (the first stable release on the 5th branch) in November 2004. I was completely unaware of such a significant tar change, and apparently I didn't care at the time. However, the entire 5th branch has been so revolutionary compared to the 4th branch that this change is just a drop in the bucket ;)


If I'm reading this right, using --no-overwrite-dir should have prevented this, I think.

I'd still extract to a local folder as nonroot, though, so still the right conclusion.
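I.e., something along the lines of the article's command with that flag added (still not something I'd actually run against /):

  sudo tar --no-overwrite-dir -C / -xpf xbps-static-latest.x86_64-musl.tar.xz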


--no-overwrite-dir (Preserve metadata of existing directories)

This is a really good point! It solves the main problem. I'll add it to the article.

Although, unfortunately, the opposite is set by default (at least in GNU tar):

--overwrite-dir (Overwrite metadata of existing directories when extracting (default).)


Been there. That was an interesting day.


Ah, the joys of tar... the one time I got the input and output args confused, I lost a lot of data. Whoops.


Alternative title: How to destroy your OS by not understanding your OS.

Don't take this the wrong way. Nice write up. :)

......

Sudo ... why would you sudo a command that you don't understand?

RTFM ... all of the behaviour you experienced is well documented. There are no surprises.

POSIX and UNIX utils are flexible and powerful. This is why we love them. Don't blame the hammer.

Sandbox anything you extract and review it. Extract in a test environment. Extract without sudo and copy the files and permissions you need.

tar -tv simply tests the archive and outputs the paths. You ignored what it told you: that it would overwrite ./ :)

  #### Never ever do this!
  $ sudo tar -C / -xvfp xbps-static-latest.x86_64-musl.tar.xz

Insanity.

Cheers :)



