In this assignment you will do C development using OpenMP, based on starter code provided to you in the `external_filters` directory of the `img432app` repo, tracking your work with GitHub issues as described below.

This directory contains the code for three C programs that can be invoked from the command line to apply image filters to jpeg files. The objective is to have the Java app call these programs via Docker, and then to modify these programs to make them data-parallel.
In this question we make it possible for our Java app to invoke the three filter programs (which, for now, are sequential) installed in the Docker container. Create and address the following GitHub issue:
Issue Title: External C filters
Issue Label: enhancement
Issue Description: Add three filters to the app, called `DPEdge`, `DPFunk1`, and `DPFunk2`. The `DPEdge` filter invokes the `jpegedge` program in the Docker container, the `DPFunk1` filter invokes the `jpegfunk1` program in the Docker container, and the `DPFunk2` filter invokes the `jpegfunk2` program in the Docker container.

Follow the instructions in the README.md file (which is in `ics432imgapp/external_filters/README.md` but better viewed in the browser) to get started with Docker.
Notes:
`jpegfunk1` and `jpegfunk2` right now do the exact same thing and are in fact the same code. In the questions below we will parallelize them in different ways. Also, they take a while!
We will not test your code for any kind of error-handling. Just make sure it works in the cases in which everything is ok (i.e., all files are there, output directory is writable, etc).
Hints:
Since the external filters are C programs that directly take input files and produce output files, it will not be possible to overlap I/O and computation. So it's probably a very good idea to extend your `WorkUnit` class into, say, a `WorkUnitExternal` class in which the "read input" and "write output" methods do nothing, and the "process" method does it all, where "all" means "invoke Docker with the right command-line arguments".
Invoking an external process in Java is pretty easy. Say you want your Java program to invoke the command `ls -la /tmp`; then you would do it this way:
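The snippet itself does not appear here, so here is a minimal sketch using the standard `java.lang.ProcessBuilder` API (class name `RunCommand` is illustrative):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class RunCommand {
    public static void main(String[] args) throws Exception {
        // Each command-line token is a separate list element
        ProcessBuilder pb = new ProcessBuilder("ls", "-la", "/tmp");
        pb.redirectErrorStream(true);  // merge stderr into stdout

        Process p = pb.start();

        // Print the command's output line by line
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }

        int exitCode = p.waitFor();  // wait for the process to finish
        System.out.println("exit code: " + exitCode);
    }
}
```

For the filters you would replace `"ls", "-la", "/tmp"` with the appropriate `docker` invocation and its arguments.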
Create and address the following GitHub issue:
Issue Title: Data-parallel edge detection filter
Issue Label: enhancement
Issue Description: Make the `jpegedge` program data-parallel (and still callable from the Java app)

The `jpegedge` program (installed in the Docker container and with source code in `c_filters/src/jpegedge.c`) implements the Sobel filter, which is used for edge detection (so as to identify individual objects in a scene). It takes two command-line arguments: `jpegedge ./image.jpg /tmp/edge_image.jpg` will apply the edge detection filter to the image in file `./image.jpg` and save the filtered image to file `/tmp/edge_image.jpg`.
For testing in this question we will use the large (gray scale) image on the left below (the filtered image is also shown on the right). Click on the image or the link below it to get the full size test image, which you then should download.
Test temple.jpg image | Filtered image |
Note: You need to rebuild the Docker container each time you modify the C code, both for testing using the container and for use in the Java app. You can of course test the filters without the Java app by "logging in" to the Docker container and doing everything on the command line.
Todo: Have the `jpegedge` program take a required 3rd command-line argument that specifies how many threads should be used to run the filter in data-parallel fashion. The number of threads passed to the program by the Java app is based on the value of the corresponding slider in the main window. The program should exit gracefully if the number or values of the command-line arguments are incorrect.
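The argument check could be sketched as follows (a sketch only; the names and the usage message are illustrative, not taken from jpegedge.c):

```c
#include <stdio.h>
#include <stdlib.h>

/* Validate argc/argv; returns the thread count, or exits with a message
   if the number or values of the arguments are incorrect. */
int parse_num_threads(int argc, char *argv[]) {
    if (argc != 4) {
        fprintf(stderr, "Usage: %s <input.jpg> <output.jpg> <num_threads>\n",
                argv[0]);
        exit(1);
    }
    int n = atoi(argv[3]);
    if (n < 1) {
        fprintf(stderr, "Error: number of threads must be >= 1\n");
        exit(1);
    }
    return n;
}
```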
Todo: Using OpenMP, enhance the program to make the filter data-parallel, using the number of threads specified in the 3rd command-line argument. Do this by just adding a simple `#pragma omp parallel` and a `#pragma omp for` before the first of the three nested loops that go through the pixels to process (i.e., the outer loop). Since all pixel computations are identical, there is no need to experiment with any fancy loop scheduling options, etc.
Todo: In your README file report on filter execution times when using 1, 2, and 4 threads on your machine. For 2 and 4 threads, give the speedup and the parallel efficiency. If your machine has more than 4 cores, then feel free to report on numbers for more threads.
Create and address the following GitHub issue:
Issue Title: Data-parallel funky filter (naive)
Issue Label: enhancement
Issue Description: Make the `jpegfunk1` program data-parallel using a naive parallelization strategy

Your developer friend has been working with an artist who's exploring strange image filters to automatically produce images (insert here philosophical discussion of the definition of what art really is and of the place of automation in art). Your friend has implemented one of the artist's filters, which is called "Funk". This filter is quite computationally expensive, and some pixels are much more expensive to compute than others. Your friend doesn't know OpenMP and has asked you to make the filter data-parallel.

The `jpegfunk1` program (installed in the container and with source code in `c_filters/src/jpegfunk1`) is used exactly like the `jpegedge` program in the previous question.
For testing in this question we will use the image on the left below (the filtered image is also shown on the right). Click on the image or the link below it to get the full-size test image, which you should then download. (Applying the filter to that image can take several minutes.)
Test humu.jpg image | Filtered image |
Todo: Have the program take a required 3rd command-line argument that specifies how many threads should be used to run the filter in a data-parallel fashion.
Todo: Using OpenMP, enhance the program so that the filter is executed in a data-parallel fashion, using the number of threads specified in the 3rd command-line argument. Do this by just adding a simple `#pragma omp parallel` and a `#pragma omp for` before the first of the three nested loops that go through the pixels to process (i.e., the outer loop), exactly as you did in Question #1.
Todo: Have each thread print to the terminal how much time it spent in the for loop.
Todo: In your README file:
Report filter execution times when using 1, 2, and 4 threads. For 2 and 4 threads, give the speedup and the parallel efficiency.
Say whether the results are better or worse than in Question #2 for the Edge filter.
When using 4 threads, report on the time each thread spends in the loop. Say whether you would consider the execution well load-balanced.
Hints:
To have each thread print the time it spent in the for loop, using the `nowait` clause is key (see lecture notes).
A portable way to measure time in a C program, not very accurate but sufficient for our purposes, is to use `gettimeofday`. Here is a fragment of code that showcases its use:
#include <stdio.h>
#include <sys/time.h>
...
struct timeval start, end;
gettimeofday(&start, NULL);
...
gettimeofday(&end, NULL);
printf("elapsed: %.2lf\n",
((1000000.0 * (end.tv_sec - start.tv_sec) +
(1.0 * (end.tv_usec - start.tv_usec)))/1000000.0));
...
Create and address the following GitHub issue:
Issue Title: Data-parallel funky filter (clever)
Issue Label: enhancement
Issue Description: Make the `jpegfunk2` program data-parallel using a "clever" parallelization strategy
The load-imbalance problem in the previous question is severe and kills parallel efficiency. Come up with a better use of OpenMP than in Question #3. Specifically, when using 4 threads, your implementation should, as much as possible, have all threads compute for the same amount of time.
Hints:
Todo: In your README file, include the following information:
Report filter execution times when using 1, 2, and 4 threads. For 2 and 4 threads, give the speedup and the parallel efficiency.
When using 4 threads, report on the time each thread spends in the loop.
Compare these results to those in Question #3.
Explain how you used OpenMP to improve the loop parallelization.