Unicorn – The Ultimate CPU Emulator (unicorn-engine.org)
109 points by tosh 1 day ago | 30 comments



The problem is that it's not sustainable: QEMU has improved a great deal since the fork, and updating the QEMU code in Unicorn is always done manually. This matters most for architectures that evolve quickly: ARM64, RISC-V, x86. Meanwhile, QEMU now has the notion of TCG plugins[1] that can read/write registers and memory, which is enough for most cases. You can see many example plugins in the contrib/plugins[2] directory of mainline QEMU, which is a good starting point.

[1] https://www.qemu.org/docs/master/devel/tcg-plugins.html

[2] https://gitlab.com/qemu-project/qemu/-/tree/master/contrib/p...


This is actually the whole reason I wrote the patches that allow you to read and write memory and registers. I work on fuzzing, and fuzzing tools are a fragmented ecosystem of QEMU forks and patches that are outdated the moment they are published. Even PANDA from MIT LL, which has great support, struggled to keep its patches rebased and compatible with QEMU's actually-pretty-fast releases. Upstream or bust: it's really not that hard, it just takes a little persistence (and with LLMs, learning git email is easy)!

This looks useful for a lot of instrumentation use cases, but less so for building custom emulators, if I'm understanding it correctly.

Plugins would definitely be the wrong tool for the job for actual new emulator development, but if you needed to do something easy, like adding an NVRAM for a router rehost, you could do that with plugins, e.g. by skipping an instruction and using a callback to implement the desired behavior. That only works as long as it is something a plugin is allowed to do (you can't access device memory yet).

For this, I believe working with the mainline is a more promising path, at least to reduce the amount of changes needed[1].

[1] https://gitlab.com/qemu-project/qemu/-/work_items/1896


For anyone who isn't familiar with Unicorn: it doesn't emulate any specific whole system; it's a library/framework for emulating just the CPU. You are responsible for hooking up the whole "rest of the world" to the emulated CPU, for whatever you might need. This includes things like emulating peripherals, syscalls, binary loading, etc.

You usually use it to build your own emulator or other analysis tool, often for reverse engineering.
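As a rough illustration of what "just the CPU" means, here is a minimal sketch using Unicorn's Python binding, assuming the `unicorn` package is installed. The names `run_snippet`, `CODE`, `BASE`, and `HAVE_UNICORN` are made up for this sketch, and the import guard is only there so the script degrades gracefully without the package. Note that mapping memory and "loading" the code are done by hand, since there is no OS or loader.

```python
# Minimal sketch: emulate two x86 instructions on a bare Unicorn CPU.
try:
    from unicorn import Uc, UC_ARCH_X86, UC_MODE_32
    from unicorn.x86_const import UC_X86_REG_ECX, UC_X86_REG_EDX
    HAVE_UNICORN = True
except ImportError:  # binding not installed (e.g. `pip install unicorn`)
    HAVE_UNICORN = False

CODE = b"\x41\x4a"   # x86-32 machine code: INC ecx; DEC edx
BASE = 0x1000000     # arbitrary load address chosen for this sketch


def run_snippet(ecx=0x1234, edx=0x7890):
    """Run CODE on an emulated CPU and return the final (ecx, edx).

    Returns None if the unicorn package is not available.
    """
    if not HAVE_UNICORN:
        return None
    mu = Uc(UC_ARCH_X86, UC_MODE_32)      # a CPU with no memory and no OS
    mu.mem_map(BASE, 2 * 1024 * 1024)     # we have to supply the RAM ourselves
    mu.mem_write(BASE, CODE)              # "binary loading" is on us too
    mu.reg_write(UC_X86_REG_ECX, ecx)     # set up the initial register state
    mu.reg_write(UC_X86_REG_EDX, edx)
    mu.emu_start(BASE, BASE + len(CODE))  # run until the end of the snippet
    return mu.reg_read(UC_X86_REG_ECX), mu.reg_read(UC_X86_REG_EDX)


if __name__ == "__main__":
    result = run_snippet()
    if result is None:
        print("unicorn is not installed")
    else:
        print("ecx=%#x edx=%#x" % result)
```

With the package installed, this should print `ecx=0x1235 edx=0x788f`. Anything beyond that (syscalls, peripherals, a real binary format) would have to be added through hooks and your own code.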


Somewhat relatedly, is there something halfway between QEMU and Unicorn? That is, a full VM in a library, with debugging capabilities. I'd like to be able to configure a VM, save the execution at a specific point, modify memory, run, and stop when some condition is hit (e.g. a memory address is read, or executed). For years I've had this idea of running the Jamella editor in multiple threads to crack Diablo II item seeds.

I sometimes use Qiling [0] (built on top of Unicorn) for this kind of thing (it can take application snapshots that you can restore, and you can also use something similar to x86/x86-64 hardware memory breakpoints). It might fit what you want, although it can sometimes be a pain in the rear to set up...

[0] https://github.com/qilingframework/qiling


Sweet, thanks. It doesn't seem to be exactly what I'm looking for, in that it simulates (replaces) the OS instead of hosting it, but it's still interesting.

Maybe kinda sorta https://github.com/momo5502/sogen? It can even virtualize Modern Warfare 2 these days.

Well, there's ptrace/gdb? (Since you mentioned Diablo II, you might want a Windows debugger, but same idea.)

Well, the program doesn't really work anymore, hence why I want a VM.

If it runs in Wine, you can use winedbg

I was just looking at Unicorn last week because it's used by unipacker to do automated unpacking of binaries. I built a "toolbox" for gpt-5.5 to do semi-automated malware and exploit reverse engineering and unipacker is sometimes useful for that purpose.

I’m using it a lot in AI-driven reverse engineering (old DOS games). Agents love it (usually with a Python harness).

Codex just walked me through my first experience with unicorn the other day, emulating / stubbing out subsystems from a Pioneer CDJ-3000 to help understand its music catalog database format and network protocol.

It felt like science fiction watching Codex write unicorn to host binaries and reverse engineer them.


"Based on Qemu 5, we built Unicorn2 from scratch"

What?


Unicorn is more like a fork of Qemu than something that depends on it. So what they're saying here is, they reimplemented the Unicorn feature-set (which dramatically diverges from the Qemu feature-set) from scratch based on a newer Qemu branch.

Uh... what is a CPU emulator? Or what can I do with it? I'm kind of having a hard time comprehending this.

Low-level debugging, older games (so many consoles have used everything from MIPS to PowerPC as CPUs), etc.

In the early 2000s, I used a Linux-based emulator to virtualize some ancient manufacturing hardware control software that was still running on EOL and very expensive PA-RISC kit. It saved the company tens of thousands of dollars in new hardware, while also running faster (it involved early 1990s-era proprietary vector graphics, as part of its job was printing on the goods). The HP sales people were not amused and tried very hard to get my 22-year-old self fired, but my manager convinced them to use it and keep the old hardware as a backup for a while. Last I heard, in 2011, it was still being used, though running in Linux on VMware.


An emulator is a computer program that executes the machine code of some system. For example, if your computer is x86, you can't natively run ARM machine code. But an emulator can.

QEMU is an emulator that can run entire operating systems, because it emulates hardware devices like hard drives and displays. Unicorn doesn't emulate any of those things; it only emulates the CPU. It's probably mostly useful for compiler development and security research / reverse engineering.
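To make "executes the machine code of some system" concrete, here is a toy sketch of the fetch/decode/execute loop at the heart of any CPU emulator. The three-instruction machine, its opcodes, and the `emulate` function are invented for this example and do not correspond to any real CPU or to how Unicorn or QEMU work internally (both translate guest code rather than interpreting it byte by byte).

```python
# Toy emulator for a made-up 3-instruction machine (hypothetical ISA).
LOAD, ADD, HALT = 0x01, 0x02, 0xFF   # opcodes invented for this sketch


def emulate(code, num_regs=4):
    """Fetch, decode and execute `code` (bytes); return the final registers."""
    regs = [0] * num_regs
    pc = 0                                  # program counter, an index into `code`
    while True:
        opcode = code[pc]                   # fetch
        if opcode == LOAD:                  # LOAD reg, imm8
            regs[code[pc + 1]] = code[pc + 2]
            pc += 3
        elif opcode == ADD:                 # ADD dst, src (dst += src, 8-bit wrap)
            dst, src = code[pc + 1], code[pc + 2]
            regs[dst] = (regs[dst] + regs[src]) & 0xFF
            pc += 3
        elif opcode == HALT:                # stop and hand back the CPU state
            return regs
        else:
            raise ValueError(f"unknown opcode {opcode:#x} at offset {pc}")


# "Machine code" for the toy CPU: r0 = 40; r1 = 2; r0 += r1; halt
program = bytes([
    LOAD, 0, 40,
    LOAD, 1, 2,
    ADD, 0, 1,
    HALT,
])

if __name__ == "__main__":
    print(emulate(program))   # prints [42, 2, 0, 0]
```

Like Unicorn, this loop knows nothing about disks, displays, or an operating system: it only has registers and a block of code, and everything else would have to be bolted on by whoever uses it.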


This comparison to qemu gives some idea: https://www.unicorn-engine.org/docs/beyond_qemu.html

The ability to execute and inspect some code without any context (no OS, not even a complete binary) is useful for reverse/security engineering.


Well, say all you've got is an x86 device, but you want to develop for ARM. You can write and compile your code, push it to unicorn, and see how it runs.

Or you can use it as a sandbox serving x86 software on an x86 machine.

Or as a "virtual machine" serving say AOSP for ARM on a Windows x86 host.

There's a long list of projects using Unicorn at https://www.unicorn-engine.org/showcase/


How does this one differ from QEMU?


It can be used for many things. But the main use is reverse engineering.

Main use for consumers / tinkerers. In industry, the main use is during pre-silicon development. You emulate the target processor and model the peripherals, and you have a complete virtual representation of your new SoC/MCU/etc. before the hardware is available. The benefit is that once you do get the hardware, you already have the entire software stack nearly ready to go and already tested.

This. It is far easier to debug something like obfuscated DRM code when you have it running inside an emulator and can wind the code forwards and backwards and see the whole machine, rather than trying to debug it on the actual hardware where your options are more limited.

> Based on Qemu 5, we built Unicorn2 from scratch, […] still maintaining backward compatibility with the current version, […] we also added 2 highly-demanded architectures in PowerPC & RISCV.

Qemu supports RV and PPC!

And that is not what “from scratch” means!


I think they re-implemented Unicorn on top of a newer QEMU (version 5), rather than trying to port their old modifications over. Unicorn apparently did not support RV and PPC in the older version.


