
ios – TrueDepth Camera Pixel Distance Inaccuracies


I am using the front-facing TrueDepth camera together with Vision to recognize points in the image and run some measurements. I understand Vision coordinates are normalized, so I am converting the Vision normalized points to CGPoints corresponding to the view, then attempting to match those to the depthData in dataOutputSynchronizer to get the z value. Then, using the camera intrinsics, I am attempting to get the distance between 2 points in 3D space.
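
For reference, the distance step I have in mind amounts to back-projecting each pixel through the intrinsics and taking the Euclidean distance between the two camera-space points. Here is a rough sketch of that math (illustrative only, not my actual project code; the helper name is made up, and it assumes the pixel coordinates are expressed in the intrinsic matrix's reference dimensions and that depth is in meters):

import simd
import CoreGraphics

// Sketch: back-project two pixels (with their depths) into camera space using
// the intrinsic matrix, then measure the straight-line distance between them.
// Assumes (u, v) are in the intrinsic matrix's reference frame and z is in meters.
func distanceBetween(_ p1: CGPoint, depth z1: Float,
                     _ p2: CGPoint, depth z2: Float,
                     intrinsics: simd_float3x3) -> Float {
    let fx = intrinsics.columns.0.x
    let fy = intrinsics.columns.1.y
    let cx = intrinsics.columns.2.x
    let cy = intrinsics.columns.2.y

    // X = (u - cx) * z / fx, Y = (v - cy) * z / fy, Z = z
    func unproject(_ p: CGPoint, _ z: Float) -> simd_float3 {
        simd_float3((Float(p.x) - cx) * z / fx,
                    (Float(p.y) - cy) * z / fy,
                    z)
    }

    return simd_distance(unproject(p1, z1), unproject(p2, z2))
}

If that math is sound, the problem presumably comes down to whether the (u, v, z) samples themselves are right.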

I have successfully found the points and (I believe) converted them to screen points. My thinking here is that these CGPoints would be no different than if I had tapped them on the screen.

My issue is that even though the converted CGPoints stay largely similar (my hand does move around a little during testing but stays largely planar to the camera) and I am calculating the depth position in the same way for both, the depths can be wildly different – especially point 2. Depth point 2 seems more accurate in terms of calculated distance (my hand is about 1 foot from the camera), but it varies a lot and still isn't accurate.

Here is a console print with the relevant data:

there are 2 points found
recognized points
[(499.08930909633636, 634.0807711283367), (543.7462849617004, 1061.8824380238852)]
DEPTH POINT 1 =  3.6312041
DEPTH POINT 2 =  0.2998223

there are 2 points found
recognized points
[(498.33644700050354, 681.3769372304281), (602.3667773008347, 1130.4955183664956)]
DEPTH POINT 1 =  3.6276162
DEPTH POINT 2 =  0.560331

Here is some of the relevant code.

dataOutputSynchronizer

func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
                                didOutput synchronizedDataCollection: AVCaptureSynchronizedDataCollection) {
        
        var handPoints: [CGPoint] = []
        
        // Read all outputs
        guard renderingEnabled,
            let syncedDepthData: AVCaptureSynchronizedDepthData =
            synchronizedDataCollection.synchronizedData(for: depthDataOutput) as? AVCaptureSynchronizedDepthData,
            let syncedVideoData: AVCaptureSynchronizedSampleBufferData =
            synchronizedDataCollection.synchronizedData(for: videoDataOutput) as? AVCaptureSynchronizedSampleBufferData else {
                // only work on synced pairs
                return
        }
        
        if syncedDepthData.depthDataWasDropped || syncedVideoData.sampleBufferWasDropped {
            return
        }
        
        let depthPixelBuffer = syncedDepthData.depthData.depthDataMap
        guard let videoPixelBuffer = CMSampleBufferGetImageBuffer(syncedVideoData.sampleBuffer) else {
            return
        }
        
        // Get the cameraIntrinsics
        guard let cameraIntrinsics = syncedDepthData.depthData.cameraCalibrationData?.intrinsicMatrix else {
            return
        }
        
        let image = CIImage(cvPixelBuffer: videoPixelBuffer)
        
        let handler = VNImageRequestHandler(
           cmSampleBuffer: syncedVideoData.sampleBuffer,
           orientation: .up,
           options: [:]
         )
        
         do {
           try handler.perform([handPoseRequest])
           guard
             let results = handPoseRequest.results?.prefix(2),
             !results.isEmpty
           else {
             return
           }

            var recognizedPoints: [VNRecognizedPoint] = []

             try results.forEach { observation in
               let fingers = try observation.recognizedPoints(.all)

               if let middleTipPoint = fingers[.middleDIP] {
                 recognizedPoints.append(middleTipPoint)
               }

               if let wristPoint = fingers[.wrist] {
                 recognizedPoints.append(wristPoint)
               }
             }

             // Store the points in handPoints if they are high-confidence points
             handPoints = recognizedPoints.filter {
               $0.confidence > 0.90
             }
             .map {
               // Adjust the Y
               CGPoint(x: $0.location.x, y: 1 - $0.location.y)
             }
             
             // Process the points found
             DispatchQueue.main.sync {
              self.processPoints(handPoints,depthPixelBuffer,videoPixelBuffer,cameraIntrinsics)
             }
         } catch {
             // Be more graceful here
         }
    }

processPoints

func processPoints(_ handPoints: [CGPoint],_ depthPixelBuffer: CVImageBuffer,_ videoPixelBuffer: CVImageBuffer,_ cameraIntrinsics: simd_float3x3) {

        // This converts the normalized points to screen points
        // cameraView.previewLayer is an AVCaptureVideoPreviewLayer inside a UIView
        let convertedPoints = handPoints.map {
            cameraView.previewLayer.layerPointConverted(fromCaptureDevicePoint: $0)
        }
       
        // We need 2 hand points to get the distance
        if handPoints.count == 2 {
            print("there are 2 points found");
            print("recognized points")
            print(convertedPoints)
            
            let handVisionPoint1 = convertedPoints[0]
        
            let handVisionPoint2 = convertedPoints[1]
            
            let scaleFactor = CGFloat(CVPixelBufferGetWidth(depthPixelBuffer)) / CGFloat(CVPixelBufferGetWidth(videoPixelBuffer))
            
            CVPixelBufferLockBaseAddress(depthPixelBuffer, .readOnly)
            let floatBuffer = unsafeBitCast(CVPixelBufferGetBaseAddress(depthPixelBuffer), to: UnsafeMutablePointer<Float32>.self)
            
            let width = CVPixelBufferGetWidth(depthPixelBuffer)
            let height = CVPixelBufferGetHeight(depthPixelBuffer)
            
            let handVisionPixelX = Int((handVisionPoint1.x * scaleFactor).rounded())
            let handVisionPixelY = Int((handVisionPoint1.y * scaleFactor).rounded())
            
            let handVisionPixe2X = Int((handVisionPoint2.x * scaleFactor).rounded())
            let handVisionPixe2Y = Int((handVisionPoint2.y * scaleFactor).rounded())
            
            CVPixelBufferLockBaseAddress(depthPixelBuffer, .readOnly)
            
            let rowDataPoint1 = CVPixelBufferGetBaseAddress(depthPixelBuffer)! + handVisionPixelY * CVPixelBufferGetBytesPerRow(depthPixelBuffer)
            let handVisionPoint1Depth = rowDataPoint1.assumingMemoryBound(to: Float32.self)[handVisionPixelX]
            
            print("DEPTH POINT 1 = ", handVisionPoint1Depth)
            
            let rowDataPoint2 = CVPixelBufferGetBaseAddress(depthPixelBuffer)! + handVisionPixe2Y * CVPixelBufferGetBytesPerRow(depthPixelBuffer)
            let handVisionPoint2Depth = rowDataPoint2.assumingMemoryBound(to: Float32.self)[handVisionPixelX]
            
            print("DEPTH POINT 2 = ", handVisionPoint2Depth)
            //Int((width - touchPoint.x) * (height - touchPoint.y))
        }
    }

In my mind right now, I am thinking my logic for finding the correct pixel in the depth map is incorrect. If that is not the case, then I am wondering if the data streams are out of sync. But honestly, I am just a little lost at the moment. Thanks for any help!
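
For comparison, one variation I have been considering (a sketch only; the helper name is made up, and it assumes a DepthFloat32 map with the same orientation and aspect ratio as the video buffer) is to index the depth map directly from the normalized Vision point, skipping the preview-layer conversion entirely:

import CoreVideo
import CoreGraphics

// Sketch: sample a DepthFloat32 map at a normalized point (y already flipped),
// scaling by the depth buffer's own dimensions rather than the view's.
func depthValue(at normalizedPoint: CGPoint, in depthPixelBuffer: CVPixelBuffer) -> Float32? {
    let width = CVPixelBufferGetWidth(depthPixelBuffer)
    let height = CVPixelBufferGetHeight(depthPixelBuffer)

    let x = Int((normalizedPoint.x * CGFloat(width)).rounded())
    let y = Int((normalizedPoint.y * CGFloat(height)).rounded())
    guard (0..<width).contains(x), (0..<height).contains(y) else { return nil }

    CVPixelBufferLockBaseAddress(depthPixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthPixelBuffer, .readOnly) }

    // Each row is bytesPerRow long; treat the row as Float32 values.
    guard let base = CVPixelBufferGetBaseAddress(depthPixelBuffer) else { return nil }
    let rowData = base + y * CVPixelBufferGetBytesPerRow(depthPixelBuffer)
    return rowData.assumingMemoryBound(to: Float32.self)[x]
}

I am not sure whether that is the right mapping either, which is part of why I am asking.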


