Attribute source camera #63
Following #59.
Considering the same number of images for both groups, with half used for testing and up to the other half for training. Just a 2D curve seems enough to represent the accuracy, the y-axis being the accuracy and the x-axis being the number of images used for training; we expect the curve to increase and preferably reach an accuracy of 100% while starting at 50%.
I initially thought about a 2D heatmap, as shown in the Stack Overflow answer 33282548. Maybe it makes sense to consider multiple images in the testing set that we know were taken with a given device, but we do not know for sure if it is the one associated with some retrieved images, see #61 and #8. Let us first investigate with one testing image.
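A minimal sketch of such an evaluation loop, assuming hypothetical helpers `train_prnu` (images → estimated PRNU) and `predict_camera` (image + PRNUs → camera name) rather than the repository's actual functions:

```python
def accuracy_curve(train_pool, test_pool, train_prnu, predict_camera):
    """Accuracy as a function of the number of training images per camera.

    train_pool / test_pool map a camera name to its list of images;
    the curve is expected to rise from ~50% towards 100%.
    """
    max_size = min(len(images) for images in train_pool.values())
    sizes, accuracies = [], []
    for size in range(1, max_size + 1):
        prnus = {camera: train_prnu(images[:size]) for camera, images in train_pool.items()}
        correct = sum(predict_camera(image, prnus) == camera
                      for camera, images in test_pool.items()
                      for image in images)
        total = sum(len(images) for images in test_pool.values())
        sizes.append(size)
        accuracies.append(correct / total)
    return sizes, accuracies
```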
What execution time can we expect from such an algorithm? As the longest iteration takes about 4 minutes and 28 seconds, for 50 executions we can expect about 4 hours of execution.
Will we face a normalization issue, as we do not know what normalization to apply to a testing image since, in theory, we do not know the associated device?
Well, we can apply a single normalization for both image types.
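A sketch of such a single normalization, assuming images are NumPy arrays: one shared scale is computed over all images, so a testing image does not need its (unknown) source device to be normalized. The names are illustrative, not the repository's:

```python
import numpy as np

def shared_normalization(images):
    """Normalize every image with one global [min, max] instead of a per-camera range."""
    min_color = min(float(image.min()) for image in images)
    max_color = max(float(image.max()) for image in images)
    return [(image - min_color) / (max_color - min_color) for image in images]
```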
319ca8fb60d64793a6e17ce5993d1a6648f2c49b
had a good idea, if I remember correctly, about PRNU and distance, something like a convolution, as we know what PRNU we expect, but I do not remember the point.
Debugging the case of always guessing Rafael by just providing a set of 2 images, one for training and one for testing, both being the same image.
In theory, Rafael images are more controlled, so they have less noise, while RAISE images have different noise, so they are not similar.
Normalization across cameras may be incorrect if the cameras do not, in general, sample in the same range.
To test faster, directly evaluate the whole dataset instead of an increasing one.
I have the feeling that the wavelet-estimated PRNU for one camera is not as expected. Note that no matter the random order in which the images are averaged to compute the estimated PRNU, the result is the same, so we should have an identical estimated PRNU.
Well, in fact we have to consider 50 random images for the training.
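For reference, a sketch of the averaging-based PRNU estimate: because the mean is order-independent, shuffling the 50 training images cannot change the result. `extract_noise` here is a stand-in for the repository's noise extraction:

```python
import numpy as np

def estimate_prnu(images, extract_noise):
    """Estimate a camera's PRNU as the mean of the noise residuals of its images;
    the sum is commutative, so the (random) image order does not matter."""
    residuals = [extract_noise(image) for image in images]
    return np.mean(residuals, axis=0)
```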
Thanks to determinism (see #24), the final `imagesCamerasFileNames` is deterministically determined, I verified this. Hence, we can provide these image names to `extract_noise.py` to have an estimated PRNU to compare with: using `[:50]` on the following. To be precise, here are the given image names:
Note that the training set is currently set to the first half of these 100 image names.
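A sketch of that deterministic selection, with a toy stand-in for `imagesCamerasFileNames` (the real mapping comes from the repository); the `[:50]` mirrors the slicing mentioned above:

```python
# Toy stand-in for the repository's imagesCamerasFileNames (camera -> sorted file names):
imagesCamerasFileNames = {
    'RAISE': [f'raise_{index:03}.png' for index in range(100)],
    'Rafael 23/04/24': [f'rafael_{index:03}.png' for index in range(100)],
}

# Deterministic training selection: the first 50 names per camera,
# so the same names can be passed to extract_noise.py to reproduce the estimated PRNU.
trainingFileNames = {camera: fileNames[:50]
                     for camera, fileNames in imagesCamerasFileNames.items()}
```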
Training is not faster just because the training set size is half the dataset's, as we also have to estimate the PRNU for the testing set.
For RAISE:
Modified contrast and brightness:
Rafael 23/04/24:
Modified contrast and brightness:
Well, in fact the estimated PRNU may be a bit different, as the scale no longer depends on a single camera's images but is computed across cameras.
`{min,max}Color` is correctly computed by considering both training and testing sets. As expected, we have a wider scale due to taking into account both testing sets:
So if one of the two estimated PRNUs generated by `attribute_source_camera.py` has an issue, it should be the Rafael one, as its intensity range is far narrower than RAISE's. To only run the prediction once, I have estimated the PRNUs on the whole training sets.
Also note that one of the two cameras' estimated PRNUs may have to be at a different resolution to be comparable with the other camera's images.
0.51 accuracy according to `RAISE,Rafael 23_04_24_wavelet_accuracy_of_camera_source_attribution.npy`. The estimated PRNUs seem to make sense:
RAISE:
Modified contrast and brightness:
Rafael:
Modified contrast and brightness:
Rafael:
RAISE:
Still accurate as of 13/05/24.
Complexity analysis between testing all training iterations and testing only the last one.
Complexity of testing all training iterations: `TESTS * TRAINING_ITERATIONS`
Complexity of testing the last iteration only: `TESTS`
The question is whether `TESTS` is significant enough that there is a significant gap with `TESTS * TRAINING_ITERATIONS`. It seems to proceed at 4 of the 50 `TESTS` per second, hence `TESTS` takes 12.5 seconds and `TESTS * TRAINING_ITERATIONS` takes 10.4 minutes.
What we expect is either predicting perfectly already at the first iteration, or progressively predicting better and better, but there is no reason for the improvement to be linear in the number of iterations.
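A quick back-of-the-envelope check of those figures, assuming 50 tests and 50 training iterations as in the comment:

```python
TESTS = 50                # testing images evaluated per iteration
TRAINING_ITERATIONS = 50  # number of growing training set sizes
TESTS_PER_SECOND = 4      # observed: about 4 of the 50 tests per second

print(TESTS / TESTS_PER_SECOND)                             # 12.5 seconds for the last iteration
print(TESTS * TRAINING_ITERATIONS / TESTS_PER_SECOND / 60)  # ~10.4 minutes for all iterations
```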
I have the feeling that when proceeding to all iterations the program gets killed because of memory. As far as I understand there is no significant (less than one image) additional memory usage at each iteration, so it is unclear why we observe this behavior; otherwise we could force the garbage collector (if there is any) to work before each iteration, if it does not already do so perfectly. Related to Benjamin-Loison/cpython/issues/21.
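If memory really is the cause, explicitly invoking CPython's garbage collector before each iteration is a cheap experiment (a sketch, not the repository's code):

```python
import gc

TRAINING_ITERATIONS = 50

for trainingSize in range(1, TRAINING_ITERATIONS + 1):
    # ... estimate the PRNUs and run the tests for this training set size ...
    gc.collect()  # force a collection so unreachable arrays are freed before the next iteration
```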
Note that if `Denoiser.MEAN` works fine, later on or even right now we will be interested in not using it, as it requires identical scene images.
Well, as quite expected:
Allowing the use of the correct camera training set mean seems fine, as it is quite easy to check what scene, hence what dataset, an image belongs to in the case of flat-field images. In fact, for flat-field images we can attribute the camera without extracting noise, just by using the flat-field images as they are (see the sketch below). We should not guess what mean to use, as the point is that we do not know it later on. Testing them all seems to make sense.
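A sketch of that direct flat-field attribution, without any noise extraction: compare the raw testing image to each camera's mean training flat-field and pick the closest in RMS (hypothetical helper, illustrative only):

```python
import numpy as np

def attribute_flat_field(test_image, camera_mean_flat_fields):
    """Attribute a flat-field testing image to the camera whose mean training
    flat-field is closest in RMS, using the images as they are."""
    def rms(a, b):
        return float(np.sqrt(np.mean((a - b) ** 2)))
    return min(camera_mean_flat_fields,
               key=lambda camera: rms(test_image, camera_mean_flat_fields[camera]))
```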
Using an iterative mean for denoising seems to make sense.
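A sketch of such an iterative mean: it yields the same result as averaging all images at once, but only keeps the running mean in memory:

```python
import numpy as np

def iterative_mean(images):
    """Running mean m_n = m_{n-1} + (x_n - m_{n-1}) / n, identical to np.mean over all images."""
    mean = None
    for count, image in enumerate(images, start=1):
        image = image.astype(np.float64)
        mean = image if mean is None else mean + (image - mean) / count
    return mean
```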
At least, using a different mean per dataset part (i.e., training and testing) does not seem to make much sense after all:
We have to understand why we suddenly get such good results.
We should compute the RMS between both training set means and see if it is close to the 672 above.
For the mean and wavelet denoisers, we should investigate the first testing entry against the first 2 training estimated PRNUs, and the first testing entry (only different in the case of the mean denoiser) against the whole training set estimated PRNUs. Using box plots seems to make sense.
Example for a given pixel:
RAISE: -1
Rafael: 1
Noise: 0.1
RMS: Rafael
Correlation: -0.1 with RAISE and 0.1 with Rafael according to `signal.correlate2d` (and now what do we do with this result?)
To have a single scalar measuring the correlation between 2 images, then apply RMS? Does it make sense on my example?
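A sketch reproducing this single-pixel example and the two scores, the RMS distance to each estimated PRNU and the correlation via `scipy.signal.correlate2d`; what decision rule to build on top of them is still the open question:

```python
import numpy as np
from scipy import signal

prnus = {'RAISE': np.array([[-1.0]]), 'Rafael': np.array([[1.0]])}
noise = np.array([[0.1]])  # noise extracted from the testing image

for camera, prnu in prnus.items():
    rms = float(np.sqrt(np.mean((noise - prnu) ** 2)))
    correlation = float(signal.correlate2d(noise, prnu, mode='valid')[0, 0])
    print(camera, f'RMS={rms:.2f}', f'correlation={correlation:.2f}')
# RMS favors Rafael (0.9 < 1.1); the correlation is -0.1 with RAISE and 0.1 with Rafael.
```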
A dedicated issue can be opened for implementing mean denoiser support for attributing the source camera.
Changed title from "Attribute camera source" to "Attribute source camera".
With bilateral denoiser:
Is it the most appropriate, while staying in the use-case context, to compute `{min,max}Color` only on training images? Do not forget all denoisers.
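A sketch of the use-case-faithful variant questioned above: fit `{min,max}Color` on the training images only and reuse that scale, clipped, for testing images, whatever denoiser is used (illustrative names only):

```python
import numpy as np

def fit_color_scale(training_images):
    """Compute {min,max}Color from the training images only, since a testing
    image's source camera is unknown in the real use case."""
    min_color = min(float(image.min()) for image in training_images)
    max_color = max(float(image.max()) for image in training_images)
    return min_color, max_color

def apply_color_scale(image, min_color, max_color):
    """Scale with the training range and clip values falling outside of it."""
    return np.clip((image - min_color) / (max_color - min_color), 0.0, 1.0)
```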