Structure Sensor: Capture the World in 3D (kickstarter.com)
133 points by hugs on Sept 17, 2013 | 49 comments


Such a wonderful product. Also an uncommonly informative video that sent my mind going at 826 miles per hour. If their product is a window to the future, here's what I looked out and saw, 20 years from now:

Loads of people walking around with even better 3D sensors stuck somewhere on their head. The devices are networked (a feature which has proven able to save lives in a number of tense evacuations) and upload to a common database. Algorithms have enough snippets to calculate optimal stitching. With haptic VR you really can go to a place without leaving your home. Clever people are also devising agents that go out and calculate optimal itineraries for busy travelers. There is a market in keeping top routes relatively clear.

A handful of companies vie to be gatekeepers to this data; I think Google is amongst them. In some timelines people are micro-paid cataloguers and explorers. Most people get a tiny amount just for walking around and sharing. Some teams are very skilled at this and get paid well for their unique 3D-grapher experiences. Tombs and such had a labor supply so high that they were cheaply and quickly mapped out, and level design became a fair bit easier. Mostly boring places are left to map; they're so dull that the pay for walking around them thoroughly is decent.

The 3D sensors are good enough that facial recognition beats anything a human can do. People get micro-currency kickbacks if they let their expression geometry and infradata be tracked as they visit various stores. The aforementioned gatekeepers use location and networked devices to track your face and infer emotions as you browse assorted merchandise. The main individual benefit is that no one ever gets lost anymore and every user knows where everything is, no matter that it's their first visit. Government expands funding into denser, more reliable storage. Everyone benefits. Laws really need to take into account that as soon as you leave the house, your every step is tracked; everyone is hence at least a minor criminal.

Laptops and such are also fitted with similar, more precise sensors so Ninja Bull Growth Hackers can track every minute, subsecond flicker of expression, every heartbeat and temperature fluctuation as you browse whatever the current iteration of social media photo sharing is. It's opt-in by default; setting up micropayments is a tad tricky.


This is one of those things that, I'm betting, after a few years and some simplification, every new high-end smartphone will have as a standard feature.


So this is a topic I have thought about for a while -- I released a 3D scanning app for iPhone nearly 3 years ago (it uses the screen as the light source for scanning; it's called Trimensional) and I've had many conversations about when Kinect-style sensing would become standard.

If we just take Apple as an example, for them to include it as standard, it would have to really justify its existence. (I realize many people don't care about Apple, but I choose Apple because, for example, there have been stereo cameras on various Android models over the years, but since developers can't really count on the presence of a stereo camera for a large number of users, nothing game-changing ever really came of it, even though stereo cameras enable 3D sensing.)

Though user interaction and 3D acquisition are amazingly cool, I don't think Apple would ever include a depth sensor just for those reasons. However, Apple has shown it cares pretty deeply about the iPhone as a camera and is even willing to add new hardware if it helps people take better photos, the dual LED flash being the latest example of this.

So, as crazy as it sounds, I think Apple could add a back-facing depth sensor in 2-3 years mostly in order to allow for automatic depth-based post-processing of photos, mostly for better-looking lighting, all without the user having to care that there was 3D sensing involved. All the other stuff will then become possible, and widespread, as a side-effect.
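
To make that concrete: one simple depth-based correction would be undoing flash falloff, since flash illumination fades roughly with the square of distance. This is just an illustrative sketch with made-up numbers, not anything Apple has announced:

    import numpy as np

    def compensate_flash_falloff(image, depth, reference_depth=1.0):
        """Brighten distant pixels to undo ~1/d^2 flash falloff.

        image: float32 RGB in [0, 1]; depth: per-pixel distance in meters;
        reference_depth: the distance whose exposure is left unchanged.
        """
        # Inverse-square law, clamped so tiny or huge depths don't blow up the gain.
        gain = np.clip(depth / reference_depth, 0.25, 4.0) ** 2
        return np.clip(image * gain[..., None], 0.0, 1.0)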

EDIT: On the Android front, the PrimeSense folks had a booth set up at CVPR this year and mentioned that at least one manufacturer was planning to include their Capri sensor (a miniature Kinect) in an Android tablet some time soon.


I think 2 things need to happen for depth sensing to go mainstream (and I do believe they will happen sooner or later):

- Games. Look at the tennis ball example in the original article. Remember Google's IRL location-based game (Ingress); now just imagine that with fine-grained 3D (Google will use it to crowd-source a centimeter-scale model of the earth). Games like this remain a bit of a niche, but just imagine if someone makes a massive social game out of it on FB (a cross between Farmville, The Sims, and Minecraft, projected onto your real world). Of course, someone could also create a shocking IRL FPS game (imagine your kids pointing this out the window in traffic and "shooting" at people and cars to watch them blow up). Finally, something to use the processing power in these little phones and tablets.

- 3D photography. I think this is the future of photography. Take a picture of something, extract the spatial data from the image, modify it/change the p.o.v. Recall the recent image/object manipulation video (the SIGGRAPH one that used the PatchMatch algorithm to fix the background). Each photo becomes a mini-scene that you can navigate around, kinda like the "frozen" 360-degree pans in The Matrix (see the unprojection sketch below). The next step is the time dimension, in other words 3D immersive movies where the viewer can move around almost anywhere while the movie unfolds. You can guess the first industry to adopt this...

Both of these can currently be done with flat images, processing power, and some human guidance. With depth sensing it can be faster, automated, and more accurate.
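
On the 3D photography point, the geometric core is unprojecting each pixel into space using its depth and the camera intrinsics; here's a minimal NumPy sketch with placeholder intrinsics, not any real sensor's calibration:

    import numpy as np

    def unproject(depth, fx, fy, cx, cy):
        """Turn a depth map (meters) into an (H*W, 3) point cloud via a pinhole model."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

    # Placeholder data and intrinsics: a flat wall two meters away,
    # seen by a VGA-resolution depth camera.
    depth_map = np.full((480, 640), 2.0, dtype=np.float32)
    points = unproject(depth_map, fx=570.0, fy=570.0, cx=320.0, cy=240.0)
    # A new point of view is then just a rigid transform (R, t) applied to
    # `points` before projecting them back through the same camera model.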


I imagine depth sensing will be more useful for Glass-like devices that can really benefit from hands-free gesture recognition. It seems like a much better fit than cell phones.

PS -- Hi Grant! From a fellow GT Robotics alum.


Very true, and I'm glad that approach is being actively pursued by meta (http://www.kickstarter.com/projects/551975293/meta-the-most-...).

I also like the Oculus Rift hack that Occipital showed briefly here -- it makes a lot of sense for augmented/virtual reality, though I guess it means you can only interact with things when looking at your hands, as opposed to Sixense's controller-based approach.

PS: Hi, Travis!


> there have been stereo cameras on various Android models over the years

Hm. I now envision a (physical) add-on with two mirrors, like a periscope, that fits over the end of the phone and the "face" camera -- allowing the two cameras to be used for stereo imaging... Not sure if you'd be able to do just as well with motion sensor data coupled with just the main camera, having the user move the phone from side to side and using parallax for depth sensing... it would make for an interesting algorithmic problem at least.
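
A rough sketch of that two-view idea, assuming you can grab two approximately rectified frames as the user slides the phone sideways (filenames, focal length, and baseline below are placeholders); OpenCV's block matcher produces a disparity map, and depth follows from triangulation:

    import cv2
    import numpy as np

    # Two grayscale frames taken a few centimeters apart (placeholder filenames);
    # in practice they would need to be rectified first.
    left = cv2.imread("frame_left.png", cv2.IMREAD_GRAYSCALE)
    right = cv2.imread("frame_right.png", cv2.IMREAD_GRAYSCALE)

    # Classic block matching: for each pixel, find how far its neighborhood
    # shifted between the two views (the disparity).
    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

    # Depth then follows from triangulation, Z = f * b / disparity, once the
    # baseline b is estimated -- e.g. from the motion sensors during the sweep.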


Nope.

How often do you need to scan a 3D model of anything?

I'm sure it will be popular with creative people. It's possible it will get popular in some niche markets, like real estate agents taking 3D shots of the apartments they offer (though there's not much hope there; they still fail to take decent photos even now).


How often do you need one of those auto-ma-whatsits anyway? They're really only good for a few blocks of downtown where the streets are flat cobblestone. They're useless on most trails. Just get a higher resolution horse and be happy with it.


> How often do you need one of those auto-ma-whatsits anyway?

Every day.


If you want to buy, say, furniture, a big barrier to web shopping is that you cannot automatically reason about dimensions, because they are not exact enough (and, even more seriously, they are usually available only as free text).

The 3D scanning technology would enable products that are add-ons to objects where the original manufacturer did not see value in adding an interface. If I can scan my car interior, I can attach my GPS to nearly any surface "perfectly". This gives aftermarket products a very nice finish at a much lower cost.
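
Even a toy example shows what "reasoning about dimensions" buys you once sizes are real geometry rather than free text; the numbers below are made up:

    from itertools import permutations

    def fits(item_dims, space_dims):
        """True if a box of item_dims (m) fits in space_dims in some axis-aligned orientation."""
        return any(all(i <= s for i, s in zip(p, space_dims))
                   for p in permutations(item_dims))

    # Made-up numbers: a 2.10 x 0.90 x 0.85 m sofa and a scanned alcove.
    print(fits((2.10, 0.90, 0.85), (2.30, 1.00, 2.40)))  # True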


You can even imagine previewing the furniture "live" in your apartment. I've been thinking about this for a while, and Ikea took a first step with their 2014 catalogue. Ikea's version is pretty limited, but it's a start.


How often do you need to take panoramic pictures? Not that often, but it's fun and gimmicky enough that it's on the iPhone.

If (when) we can make a cheap and small sensor that can provide depth data to the camera to make cool-looking 3D pictures, it'll likely find its way onto smartphone cameras.

In 2050, will our pictures still be 2D arrays of RGB pixels?


Perhaps more to the point, how often do you "need" to take pictures at all? Smartphone cameras have proliferated even though most people don't "need" to be able to take pictures of their food or whatever.


The panorama thing is a software solution; it doesn't require any additional hardware.


The first 30 seconds or so of this video are particularly impressive: https://www.youtube.com/watch?v=JmgRdFQOLPw

I can see this dramatically reducing the amount of time it takes to build 3D environments for games once the tech has had time to mature.

Imagine fitting a fleet of drones with this tech and sending them out exploring...


> Imagine fitting a fleet of drones with this tech and sending them out exploring...

When I saw this submission, this was the first thing I thought of: http://www.youtube.com/watch?v=IxmJT5xT5rQ

From related 3D scanning videos, it looks like 3D scanning is already a pretty common operation, e.g. in mining and navigation. Putting the technology into consumer-facing mobile devices... wow.


Here's a cool example of a shipping museum that was 3D scanned at high resolution:

http://www.youtube.com/watch?v=gDTbFhFZl9I


One potential application that I'd love to see worked out with this kind of data is to infer and separate lighting and material data from an image (which would require algorithms to guess the original light source(s)) so that artificial light sources can be applied arbitrarily. Imagine taking a picture with a simple flash, and then using software to create whatever dramatic lighting effect you want to generate a new image.
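
One piece of that is easy to sketch with depth data alone: estimate surface normals from the depth map, then re-shade with a new light direction under a simple Lambertian model. This skips the hard part (separating the material from the original lighting), so treat it as an illustration, not a solution:

    import numpy as np

    def normals_from_depth(depth):
        """Estimate per-pixel surface normals from a depth map via finite differences."""
        dz_dx = np.gradient(depth, axis=1)
        dz_dy = np.gradient(depth, axis=0)
        n = np.dstack([-dz_dx, -dz_dy, np.ones_like(depth)])
        return n / np.linalg.norm(n, axis=2, keepdims=True)

    def relight(albedo, depth, light_dir):
        """Lambertian re-shading: new_image = albedo * max(0, N . L).

        albedo would ideally be the de-lit (flat-lit) image."""
        n = normals_from_depth(depth)
        l = np.asarray(light_dir, dtype=np.float32)
        l /= np.linalg.norm(l)
        shading = np.clip((n * l).sum(axis=2), 0.0, 1.0)
        return albedo * shading[..., None]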


Have you seen CrazyBump?

There is a good tutorial on it at http://www.blenderguru.com/videos/the-secrets-of-realistic-t...

It is a program that takes a single image and generates diffuse/normal/occlusion/specular/displacement textures from it, in such a way that they can be easily used as a material in Blender.

As you can see from the video, it can work amazingly well given the limited amount of input information.
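
For reference, the normal-map piece can be faked in a few lines; this is not CrazyBump's actual algorithm, just the common trick of treating image brightness as a height field and taking its gradients:

    import cv2
    import numpy as np

    def normal_map_from_image(path, strength=2.0):
        """Fake a tangent-space normal map by treating brightness as height."""
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
        gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3) * strength
        gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3) * strength
        n = np.dstack([-gx, -gy, np.ones_like(gray)])
        n /= np.linalg.norm(n, axis=2, keepdims=True)
        return np.uint8((n * 0.5 + 0.5) * 255)  # pack [-1, 1] components into 0..255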


I hadn't seen that. So if you can shoot your photos in very flat light, you could get close to the result I'm talking about with very little additional code beyond what already exists (it would help to reproject the image as a single "texture" as it would wrap around the 3D data). What I don't know of is an algorithm that "de-lights" an image, essentially bringing it to flat matte lighting, regardless of the original data.

Obviously that's something that involves multiple stages, and I can think of three. The first is removing gradients from diffuse shading. The second is lightening shadows to match their surroundings (shadows, incidentally, could probably be used in combination with the 3D structure to infer light sources). Finally, you'd need to identify specular highlights and inpaint them. You might also have to use inpainting in stage 2, in order to deal with full-black shadows.
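
A crude approximation of just that first stage is to divide the image by a heavily blurred copy of its own luminance, which flattens large-scale shading while leaving fine texture intact; a sketch (stages 2 and 3 would still need shadow and specular handling):

    import cv2
    import numpy as np

    def flatten_shading(image, blur_sigma=50):
        """Crudely remove large-scale diffuse shading (stage 1 only).

        image: float32 RGB in [0, 1]. Shadows and specular highlights
        (stages 2 and 3) would still need inpainting-style handling.
        """
        luminance = image.mean(axis=2).astype(np.float32)
        illumination = cv2.GaussianBlur(luminance, (0, 0), blur_sigma)
        gain = illumination.mean() / np.maximum(illumination, 1e-4)
        return np.clip(image * gain[..., None], 0.0, 1.0)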


I saw a video demonstrating something like this on YouTube a couple of years ago, but I don't recall any of the key words to search for.


The future is nigh.

I see this sensor as something that has an immense number of valuable uses to everyone, as well as creating an immense number of potential abuses.

Whenever I see new technology I think of the Mom and Pop shops, run by people who aren't part of the technology movement, and how it would affect them. And when I think of such a shop and this device, I think of people being able to scan the store and get not just a video image of the layout and camera locations, but a properly rendered floor plan of the store where one could pinpoint unsurveilled merchandise.

I also see a lot of regulation coming from this. Imagine a world where anyone can go to a store, pick up an object, scan it at the store and then go home and print it on a 3-d printer. It might require theft to be redefined and create a slippery slope for people who like to take pictures of things with their iPads for creative inspiration.

I'm not trying to argue against this technology by any means, but with every new technology that comes out, a lot of questions are raised for me.

I guess the issue for me is more focused towards humanity rather than the technology. It's probably a nonsensical concern as there's no way to know if we'd really be better or worse off with/without any technology that has been developed and disseminated. But like Uncle Ben/Voltaire said, ~"With great power comes great responsibility."

With the advancements in technology and their increasing acceleration, we are being spoon fed an increasing amount of power. My concern is whether or not we are responsible enough for it. This sensor looks amazing, and I can't wait to see the great uses for it but like every new technology that completely amazes me, it reminds me: The future freaks me out.


> I also see a lot of regulation coming from this. Imagine a world where anyone can go to a store, pick up an object, scan it at the store and then go home and print it on a 3-d printer.

Brick and mortar stores will cease to exist long before this technology becomes good enough to do what you've described. This technology is currently pretty awful in terms of accuracy. Btw, what would you actually be printing? A plunger? Easier to just 3D-model one. A clock? You'll never measure it accurately enough. Some kind of cheap coat hook? Not sure the store is surviving off cheap coat hook margins in the first place.

Btw, these Kinect-like cameras work by measuring the displacement of infrared dots, so there are big sources of error: first the infrared dot projector, then the camera that sees the dots, and also the surface the dots reflect off. If you're lucky enough to get decent input data, then you have to post-process the noise out, so that flat surfaces are indeed flat and curved surfaces have a smooth curve. But at what point is noise actually just subtle features? Imagine trying to scan a diamond-plate floor panel. The diamond pattern would be perceived as noise, and the optimizations would attempt to normalize the data to a flat surface.
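
That noise-versus-features tension is why edge-preserving filters such as a bilateral filter are a common compromise for depth maps; a sketch with made-up noise levels and thresholds showing where the trade-off sits:

    import cv2
    import numpy as np

    # Stand-in for a noisy Kinect-class depth map (meters); real data would
    # come from the sensor, and the noise level here is made up.
    depth = np.random.normal(loc=1.5, scale=0.01, size=(480, 640)).astype(np.float32)

    # Edge-preserving smoothing. sigmaColor decides how big a depth step counts
    # as a real feature rather than noise: push it much above a couple of
    # millimeters and genuine fine relief (diamond plate, engraving) gets
    # flattened right along with the sensor noise.
    smoothed = cv2.bilateralFilter(depth, d=9, sigmaColor=0.002, sigmaSpace=5)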

There's a long way to go before we have to be considering the dangerous power of this kind of tech. The current danger is in 3d printed firearms, which are getting better and better every day, and which have been 3d modelled by designers in 3d programs.


I agree with you in all except this:

> The current danger is in 3d printed firearms, which are getting better and better every day, and which have been 3d modelled by designers in 3d programs.

It was probably discussed here many times, but where exactly is the danger? You can (and could have for years) make better guns with CNC milling machines and/or random metal scrap lying around your backyard, and getting ammo (which seems to be the real problem) is not getting any easier with 3D printers.


>Imagine trying to scan a diamond-plate floor panel. The diamond design pattern would be perceived as noise, and the optimizations would attempt to normalize the data to be a flat surface.

I bet you could get around this somewhat if you matched the 3D data up with a normal 2D image, processing it to figure out textures and subtle features.


You can buy several high quality, factory-made guns at any gun store for the cost of just one 3D printer, so why do you call out 3D printed guns as particularly dangerous? Unless you mean their likelihood of just exploding instead of firing properly?


Even if the accuracy gets good enough, you could just seal products in boxes.


Looks like you can hack on your own sensor by buying one of these:

http://www.primesense.com/developers/get-your-sensor/


This looks like incredible core technology for mobile devices. I'm guessing it is structured light like the Kinect?


If you watch the Kickstarter video, the same people who made the tech in the Kinect worked with them on this device. (1:27 to 1:53 in the video.)


Why choose an iPad? It's the most hostile environment for this. You can't plug that cable into anything else in the world. They'll get sued by Apple. Etc.

Not to mention that if you went Linux/Android you'd probably also support desktop for "free" after implementing this over USB and writing the kernel driver.


Yeah - looks like you can unplug the Lightning cable and plug in a USB cable


You can buy the USB kit to work with any platform.


Very cool. At first I was suspicious, because their Kickstarter video doesn't give a really complete idea of how good their 3D models are, and I know that from the right angle a really terrible 3D model can look very good.

However this video shows that their 3d reconstruction is actually pretty good:

https://www.youtube.com/watch?v=w4aMQQv2Zvk

I think that an open platform like theirs, which already includes sample software that does good 3d reconstruction, is very good for the field. Much more interesting than projects that combine hardware and software so that only the makers can experiment with new software algorithms.

EDIT: another video showing it "really working"

https://www.youtube.com/watch?v=JmgRdFQOLPw


Also 'Structure Sensor/Oculus Rift Hack'; https://www.youtube.com/watch?v=stwjUHooYcA


I'm not sure I can recall a time I was this excited about a mobile accessory. Seems like a smart team making an honest effort to establish a successful product (even if this may well be a niche market).

Hopefully this gets funded; looking forward to getting my hands on one!


Sorry to be that guy, but I can't help but think of the excitement that surrounded the Leap Motion sensor, and the disappointment that turned out to be. I really hope this turns out to be even better than its promo video!


The price is surprising. Consider this $200 Creative Depth Webcam which comes out Oct 1st.

http://us.store.creative.com/Creative-Senz3D-Depth-and-Gestu...

I imagine that will be quickly hacked to do much of what this structure sensor wants to do.

Edit: updated the date; it slipped from the 25th.


Is the Structure Sensor actually doing all of the SLAM computation on-board, or is it just shipping the RGBD images off to the iPad?


The latter. Vision algorithms run on the iPad.



This might be interesting for video game designers and animators - instead of painstakingly creating a model in a modeling program, they could sculpt it in clay and scan it.


How does it work?



Lasers.

From the video: "Structure works by capturing a pattern of invisible, laser projected light, which allows you to measure thousands of distances, all at once."


That's a little misleading. The structured light pattern is indeed generated by a laser, but the distances being measured are not distances to the object, they're displacements of a structured pattern (eg, predictable dots) seen on the object with an infrared camera.
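
The arithmetic behind that displacement-to-distance step is plain triangulation; a tiny sketch with assumed focal length and baseline (not the Structure Sensor's actual specs):

    def depth_from_dot_shift(disparity_px, focal_px=580.0, baseline_m=0.075):
        """Idealized triangulation: Z = f * b / disparity.

        disparity_px: how far (in pixels) a projected dot lands from where it
        would appear at infinite distance. Real sensors measure shifts against
        a calibrated reference pattern, but the geometry is the same.
        focal_px and baseline_m are assumed values, not the Structure Sensor's specs.
        """
        if disparity_px <= 0:
            return float("inf")
        return focal_px * baseline_m / disparity_px

    # With these numbers a dot shifted by 30 px is roughly 1.45 m away.
    print(depth_from_dot_shift(30.0))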


It appears to be exactly the same as the Kinect.


Wow! That's really something! Looking forward to it.



