You can shoot a ray from the camera into the scene, and the object it collides with gets the label (if you want to show more labels at once, you could fire several parallel rays).
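To make the ray idea a bit more concrete, here is a rough engine-agnostic sketch in Python. It assumes every object is approximated by a bounding sphere, and `ray_sphere_hit`, `pick` and the `(name, center, radius)` tuples are names I made up for illustration. Your engine almost certainly has its own raycast call that you should use instead.

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along the ray to the first hit, or None.

    Assumes `direction` is unit length. A ray starting inside the
    sphere is treated as a miss to keep the sketch short.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None  # the ray misses the sphere entirely
    t = -b - math.sqrt(disc)
    return t if t >= 0 else None

def pick(origin, direction, objects):
    """Return the name of the nearest object hit by the ray, or None.

    `objects` is a list of (name, center, radius) tuples (hypothetical layout).
    """
    best = None
    for name, center, radius in objects:
        t = ray_sphere_hit(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, name)
    return best[1] if best else None
```

For several labels at once, you would call `pick` once per ray and collect the distinct names.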
Then comes the complicated and counterintuitive part: the world-space coordinates of the hit object have to be converted to screen-space coordinates. Maybe someone can fill in the details. It's the same principle that is needed for mouse pointers (screen to world), just in reverse.
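As a partial fill-in, this is roughly what the world-to-screen step does under the hood. It is only a sketch with simplifying assumptions: the camera sits at `cam_pos`, looks straight down -Z with no rotation, uses a perspective projection with a vertical field of view, and the screen origin is top-left. `world_to_screen` is a hypothetical name, and most engines ship a ready-made function for this that also handles camera rotation.

```python
import math

def world_to_screen(point, cam_pos, fov_y_deg, width, height):
    """Project a world-space point to pixel coordinates.

    Assumes an unrotated camera at cam_pos looking down -Z.
    Returns (x, y) in pixels, origin top-left, or None if the
    point is at or behind the camera.
    """
    # World space -> view space (translation only, since there is no rotation).
    x = point[0] - cam_pos[0]
    y = point[1] - cam_pos[1]
    z = point[2] - cam_pos[2]
    depth = -z
    if depth <= 0:
        return None

    # View space -> normalized device coordinates in [-1, 1].
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    aspect = width / height
    ndc_x = (f / aspect) * x / depth
    ndc_y = f * y / depth

    # NDC -> pixels; y is flipped because screen y grows downward.
    sx = (ndc_x + 1.0) * 0.5 * width
    sy = (1.0 - ndc_y) * 0.5 * height
    return sx, sy
```

A point straight ahead of the camera lands in the middle of the screen, and a point can project outside the `width` x `height` range when it is out of view, so you may want to hide the label in that case.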
Once you have the screen-space coordinates, you can calculate an offset position next to the object, connect the two points with a “line-draw” component and action, and add the label to one of the points.
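The offset step could look something like this. Again just a sketch: `label_line`, the default offset of 40 pixels right and 30 pixels up, and the 800 x 600 screen size are all placeholder assumptions, and the actual drawing is left to whatever line-draw feature your engine has.

```python
def label_line(screen_pt, offset=(40, -30), size=(800, 600)):
    """Return (start, end) of the connecting line in pixels.

    `start` is the object's screen position; `end` is the offset
    point where the label text would go, clamped to the screen.
    """
    x = min(max(screen_pt[0] + offset[0], 0), size[0])
    y = min(max(screen_pt[1] + offset[1], 0), size[1])
    return screen_pt, (x, y)
```

Clamping keeps the label on screen when the object is near an edge, at the cost of a shorter line there.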
These are just some pointers, hope it helps.