# TensorFlow Spectrogram Example

This example shows how you can load audio from a .wav file, convert it to a
spectrogram, and then save it out as a PNG image. A spectrogram is a
visualization of the frequencies in sound over time, and can be useful as a
feature for neural network recognition on noise or speech.

## Building

To build it, run this command:

```bash
bazel build tensorflow/examples/wav_to_spectrogram/...
```

That should build a binary executable that you can then run like this:

```bash
bazel-bin/tensorflow/examples/wav_to_spectrogram/wav_to_spectrogram
```

This uses a default test audio file that's part of the TensorFlow source code,
and writes out the image to the current directory as spectrogram.png.

## Options

To load your own audio, you need to supply a .wav file in LIN16 format (16-bit
linear PCM), and use the `--input_wav` flag to pass in the path.
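
If you are not sure whether a file is in LIN16 format, a quick header check can
help. The sketch below is not part of this example. It assumes a canonical
44-byte WAV header, where the audio format code sits at byte offset 20 and the
bits-per-sample field at offset 34, and the helper names `u16le_at` and
`is_lin16` are made up for illustration. Files that carry extra chunks before
`fmt ` will not be read correctly by this check.

```bash
# Read an unsigned little-endian 16-bit value at a byte offset in a file.
# Bytes are read one at a time so the result does not depend on host byte order.
u16le_at() {
  local file=$1 offset=$2 lo hi
  lo=$(od -An -tu1 -j "$offset" -N 1 "$file" 2>/dev/null | tr -d '[:space:]')
  hi=$(od -An -tu1 -j "$((offset + 1))" -N 1 "$file" 2>/dev/null | tr -d '[:space:]')
  echo $(( ${lo:-0} + 256 * ${hi:-0} ))
}

# Succeed only if the header reports PCM (format code 1) with 16 bits per sample.
# Assumption: canonical header layout, with "fmt " directly after "RIFF....WAVE".
is_lin16() {
  local file=$1
  [ "$(u16le_at "$file" 20)" -eq 1 ] && [ "$(u16le_at "$file" 34)" -eq 16 ]
}

# Example use:
#   is_lin16 /tmp/my_audio.wav && echo "looks like 16-bit PCM"
```

If the check fails, converting the file to 16-bit PCM with an audio tool of
your choice before passing it in is usually the simplest fix.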

To control how the spectrogram is created, you can specify the `--window_size`
and `--stride` arguments. These set how wide the window used to estimate
frequencies is, and how far apart adjacent windows are spaced, both in samples.
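
To get a feel for how these two values shape the output, the following sketch
counts how many full windows fit into a clip. It assumes plain sliding-window
framing with no padding, which may not match the tool's exact behavior at the
end of a clip, and `num_windows` is a hypothetical helper rather than something
this example provides.

```bash
# Count the full analysis windows that fit into a clip of a given length.
# All three arguments are in samples.
num_windows() {
  local samples=$1 window=$2 stride=$3
  if [ "$samples" -lt "$window" ]; then
    # The clip is shorter than a single window.
    echo 0
  else
    echo $(( (samples - window) / stride + 1 ))
  fi
}

# One second of 16 kHz audio, a 1024-sample window, and a 512-sample stride.
num_windows 16000 1024 512   # prints 30
```

Under the same assumption, halving the stride to 256 gives 59 windows for the
same clip, so a smaller stride produces a longer image along the time axis.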

The `--output_image` flag sets the path to save the image file to. This is
always written out in PNG format, even if you specify a different file
extension.
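
Because the extension does not determine the format, you can confirm what was
written by looking at the file's first bytes. This is a small sketch, not part
of this example: every PNG file starts with the same 8-byte signature, and the
hypothetical helper `is_png` compares a file's first bytes against it.

```bash
# Succeed if the file starts with the 8-byte PNG signature
# (decimal 137 80 78 71 13 10 26 10).
is_png() {
  local file=$1
  # Word-split od's output so spacing differences between od versions don't matter.
  set -- $(od -An -tu1 -N 8 "$file" 2>/dev/null)
  [ "$*" = "137 80 78 71 13 10 26 10" ]
}

# Example use:
#   is_png /tmp/my_spectrogram.jpg && echo "PNG data despite the extension"
```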

If your result seems too dark, try using the `--brightness` flag to make the
output image easier to see.

Here's an example of how to use several of them together:

```bash
bazel-bin/tensorflow/examples/wav_to_spectrogram/wav_to_spectrogram \
--input_wav=/tmp/my_audio.wav \
--window_size=1024 \
--stride=512 \
--output_image=/tmp/my_spectrogram.png
```