Correctly implement iterative mean for camera attribution #72
Implementation doubt
I doubt that the current iterative mean leverages the best of its current knowledge of the training set.
More precisely, we add to the estimated PRNU mean the image minus the average of the already considered training image noises, but each training image noise depends only on the knowledge available when it was computed. So it may be better to fully recompute the estimated PRNU from its components, leveraging the up-to-date mean.
Maybe both are equivalent.
Formalization
Let us formalize it step by step for the first few steps.

Let us denote by $image_{camera}^{training_j}$ and $image_{camera}^{test_j}$ the $j$-th training (respectively testing) image for the camera $camera$, with $camera = raise$ for RAISE flat-field and $camera = rafael$ for Rafael.

Let us denote the denoised image $image$ as $\text{denoiser}(image, k, camera)$, with $\text{denoiser}$ trained up to (and including) the $k$-th $camera$ training image, and the estimated PRNU for the camera $camera$ at training step $l$ as $prnu_{camera}^l$. That is:

$\text{denoiser}(k, camera) = \text{mean}(image_{camera}^{training_{[0..k]}})$

As a reminder, the mean-based denoiser principle is:

$prnu_{camera}^{\text{len}(image_{camera}^{training}) - 1} = \text{mean}(training\_image - \text{denoiser}(\text{len}(image_{camera}^{training}) - 1, camera) \text{ for } training\_image \text{ in } image_{camera}^{training})$

The following in fact considers one arbitrary camera.
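Under these definitions, the mean-based principle could be sketched as follows (the helper names are mine, not the repository's):

```python
import numpy as np

def denoiser(training_images, k):
    # mean of training images 0..k (inclusive), used as the denoised estimate
    return np.mean(training_images[: k + 1], axis=0)

def prnu(training_images):
    # mean-based PRNU: average residual against the final denoiser
    final = denoiser(training_images, len(training_images) - 1)
    return np.mean([image - final for image in training_images], axis=0)
```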
Let us consider the first image $image_{camera}^{training_0}$: then $prnu_{camera}^0 = image_{camera}^{training_0} - \text{denoiser}(0, camera)$. However, since by definition $\text{denoiser}(0, camera) = image_{camera}^{training_0}$, we get $prnu_{camera}^0 = image_0$, with $image_0$ denoting the null image.

Let us consider the first 2 images $image_{camera}^{training_{[0, 1]}}$: then there is a choice between:

$prnu_{camera}^1 = \text{mean}(image_{camera}^{training_j} - \text{denoiser}(j, camera) \text{ for } j \text{ in } [0, 1])$

$prnu_{camera}^1 = \text{mean}(image_{camera}^{training_j} - \text{denoiser}(1, camera) \text{ for } j \text{ in } [0, 1])$

So more generally, at learning step $l$ we have to choose between either:

$prnu_{camera}^l = \text{mean}(image_{camera}^{training_j} - \text{denoiser}(j, camera) \text{ for } j \text{ in } [0, 1, ..., l])$

$prnu_{camera}^l = \text{mean}(image_{camera}^{training_j} - \text{denoiser}(l, camera) \text{ for } j \text{ in } [0, 1, ..., l])$

The second seems to leverage more of the current knowledge of the training set, but are both choices equivalent?
I have the feeling that both choices are different; let us show a counterexample for the first-2-images case:

$prnu_{camera}^1 = \text{mean}(image_{camera}^{training_j} - \text{denoiser}(j, camera) \text{ for } j \text{ in } [0, 1]) = \text{mean}(image_0, image_{camera}^{training_1} - \text{mean}(image_{camera}^{training_{[0, 1]}})) = \frac{1}{2}(image_{camera}^{training_1} - \text{mean}(image_{camera}^{training_{[0, 1]}}))$

$prnu_{camera}^1 = \text{mean}(image_{camera}^{training_j} - \text{denoiser}(1, camera) \text{ for } j \text{ in } [0, 1]) = \text{mean}(image_{camera}^{training_j} - \text{mean}(image_{camera}^{training_{[0, 1]}}) \text{ for } j \text{ in } [0, 1])$

Conclusion

Compared to the first choice, the second contains an additional $j = 0$ component, $\frac{1}{2}(image_{camera}^{training_0} - \text{mean}(image_{camera}^{training_{[0, 1]}}))$, which is very probably not null, so the two choices are not equivalent.
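The counterexample can also be checked numerically; a small sketch with random arrays standing in for the two training images:

```python
import numpy as np

rng = np.random.default_rng(0)
img0, img1 = rng.random((2, 4, 4))  # stand-ins for the two training images

mean_01 = np.mean([img0, img1], axis=0)  # denoiser(1, camera)

# first choice: each image minus the denoiser known at its own step j
choice1 = np.mean([img0 - img0,        # denoiser(0, camera) = img0
                   img1 - mean_01], axis=0)

# second choice: both images minus the up-to-date denoiser(1, camera)
choice2 = np.mean([img0 - mean_01,
                   img1 - mean_01], axis=0)

# the two estimates differ, so the choices are not equivalent
assert not np.allclose(choice1, choice2)
```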
Related to #57.
Implementation
So now let us correct the implementation to implement:

$prnu_{camera}^l = \text{mean}(image_{camera}^{training_j} - \text{denoiser}(l, camera) \text{ for } j \text{ in } [0, 1, ..., l])$

$\text{denoiser}(l, camera)$ can be implemented efficiently thanks to `iterativeMean`, but it has to be fully computed before starting any PRNU estimation step.

In this case $prnu_{camera}^l$ cannot itself be implemented with `iterativeMean`, as we are not just adding components; at least it is not clear what equivalent added component would allow leveraging `iterativeMean`. We also have to pay attention to the amount of memory used by the implementation.
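For reference, `iterativeMean` presumably maintains a running mean along these lines (a sketch; the actual class API in the repository is an assumption):

```python
import numpy as np

class IterativeMean:
    """Running mean that avoids storing every sample."""

    def __init__(self):
        self.mean = None
        self.count = 0

    def add(self, value):
        # incremental update: mean += (x - mean) / n
        self.count += 1
        if self.mean is None:
            self.mean = np.array(value, dtype=np.float64)
        else:
            self.mean += (np.asarray(value) - self.mean) / self.count
        return self.mean
```

This supports computing $\text{denoiser}(l, camera)$ in one pass, but the residuals $image_{camera}^{training_j} - \text{denoiser}(l, camera)$ still require every past image to be kept in memory or re-read from disk.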
The order of magnitude of the memory necessary to load the raw images is unclear. `ls -lS` shows file sizes ranging from 18,697,276 to 20,720,477 bytes, so these files are probably compressed, otherwise they would not show such a significant size difference. However, `file` returns `flat-field/nef/flat_001.NEF: TIFF image data, big-endian, direntries=27, height=0, bps=0, compression=none, PhotometricInterpretation=RGB, manufacturer=NIKON CORPORATION, model=NIKON D7000, orientation=upper-left, width=0`, which seems to indicate no compression. Anyway, we can first implement the first approach, which is roughly what the current code does, and then the better second one.
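A rough plausibility check on the compression question, assuming the D7000's 4928×3264 sensor resolution and 14-bit raw samples (neither figure is stated in the issue):

```python
# estimated uncompressed payload of one NEF, under the assumed
# 4928x3264 resolution and 14 bits per sample
width, height, bits_per_sample = 4928, 3264, 14
raw_bytes = width * height * bits_per_sample // 8
print(raw_bytes)  # 28148736, i.e. ~28.1 MB
```

That is noticeably larger than the observed 18.7 to 20.7 MB files, which, together with the varying sizes, suggests some compression despite the `compression=none` metadata.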
If we expand:

$\text{denoiser}(k, camera) = \text{mean}(image_{camera}^{training_{[0..k]}})$

in:

$prnu_{camera}^l = \text{mean}(image_{camera}^{training_j} - \text{denoiser}(l, camera) \text{ for } j \text{ in } [0, 1, ..., l]) = \text{mean}(image_{camera}^{training_j} - \text{mean}(image_{camera}^{training_{[0..l]}}) \text{ for } j \text{ in } [0, 1, ..., l])$

Python pseudo code:
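A minimal sketch of this expanded formula (the function and variable names are assumptions, not the repository's):

```python
import numpy as np

def estimate_prnu(training_images, l):
    # denoiser(l, camera): mean of training images 0..l
    mean_image_training_0_l = np.mean(training_images[: l + 1], axis=0)
    # prnu^l: mean of the residuals against the up-to-date mean
    residuals = [image - mean_image_training_0_l
                 for image in training_images[: l + 1]]
    return np.mean(residuals, axis=0)
```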
This is about training, but there is also testing. In theory we should achieve the same accuracy as when training directly on the whole dataset, that is, from experiment, 100 % accuracy.

`cameraColorMeans` is the actual variable for `mean_image_training_0_l_camera`, and `cameraColorMeans[camera][color].add(singleColorChannelImages[color])` is equivalent to `mean_image_training_0_l_camera.add(image_training_l_camera)`. 50 images for both cameras take about 22 GB of memory.
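As a rough check of the 22 GB figure, assuming D7000-sized frames (4928×3264, matching the model reported by `file`) stored as three float64 color-channel arrays (the storage layout is an assumption):

```python
# estimated memory for 50 images kept as 3 float64 color channels each,
# assuming 4928x3264 frames
width, height = 4928, 3264
bytes_per_image = width * height * 3 * 8  # 3 channels, 8 bytes per sample
total_gb = 50 * bytes_per_image / 1e9
print(total_gb)  # ~19.3 GB, the same order of magnitude as the observed 22 GB
```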
It does not work as expected.
Related to #62.
Based on PRNU_extraction/issues/8 it seems quite clear that we should not end up with no good predictions for one class. However, note that the mentioned example does not consider the full images, only a crop, and uses a Wavelet denoiser.
Related to src/commit/be83fcf154ba144045e296f2f0e6d0d8deb58ca4/datasets/raise/fft/verify_dots.py#L22.