/usr/share/sounds/deepin/stereo/desktop-login.wav
syli@syli-PC:~/work/repo/Demo/pa$ soxi desktop-login.wav
Input File : 'desktop-login.wav'
Channels : 2
Sample Rate : 44100
Precision : 16-bit
Duration : 00:00:07.00 = 308700 samples = 525 CDDA sectors
File Size : 1.23M
Bit Rate : 1.41M
Sample Encoding: 16-bit Signed Integer PCM
syli@syli-PC:~/work/repo/Demo/pa$ ls -al desktop-login.wav
-rw-r--r-- 1 root root 1234878 6月 14 14:53 desktop-login.wav
syli@syli-PC:~/work/repo/Demo/pa$ file desktop-login.wav
desktop-login.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 44100 Hz
duration = samples / (sample rate)
7 s = 308700 samples / 44100
1 sample = (Sample Encoding) * Channels / 8 bits (here one "sample" covers all channels, i.e. one frame)
1 sample = 16 (bit depth) * 2 / 8 (bits) = 4 (bytes)
size = (1 sample size) * samples
size = 4 * 308700 = 1,234,800 (bytes)
Total file size = 1,234,878 (bytes)
Non-audio data size = 1,234,878 - 1,234,800 = 78 (bytes)
Bit Rate = (Sample Rate) * (1 sample size in bits) (bit/s)
= (Sample Rate) * ((Sample Encoding) * Channels) (bit/s)
1.41M = 44100 * 16 * 2 / 1000 / 1000 (Mbit/s)
View the hexadecimal data
hexdump -C desktop-login.wav
16-bit stereo example
syli@syli-PC:~/work/repo/Demo/pa$ head hex.txt
00000000 52 49 46 46 b6 d7 12 00 57 41 56 45 66 6d 74 20 |RIFF....WAVEfmt |
00000010 10 00 00 00 01 00 02 00 44 ac 00 00 10 b1 02 00 |........D.......|
00000020 04 00 10 00 64 61 74 61 70 d7 12 00 03 00 09 00 |....datap.......|
00000030 01 00 05 00 06 00 08 00 05 00 08 00 01 00 02 00 |................|
00000040 06 00 07 00 05 00 05 00 02 00 03 00 05 00 07 00 |................|
00000050 03 00 04 00 03 00 04 00 05 00 06 00 01 00 02 00 |................|
Position (bytes) | Example | Description |
---|---|---|
1 - 4 | "RIFF" | Marks the file as a RIFF file. Each character is 1 byte long. Always 0x52494646. |
5 - 8 | File size (integer) | Chunk Size: size of the overall file minus 8 bytes, as a 32-bit integer, i.e. the number of bytes from the next position to the end of the file. Typically filled in after the file has been written. Everything from offset 0x08 to the end of the file is the content of the "RIFF" chunk, which contains the two sub-chunks "fmt " and "data". Here 0x0012d7b6 = 1,234,870 = total file size - 8. |
9 - 12 | "WAVE" | Form Type (file type header). For our purposes it always equals "WAVE". |
13 - 16 | "fmt " | Format chunk marker (0x666D7420). Note that the fourth byte is a trailing space (0x20), not a null. |
17 - 20 | 16 | SubChunk Size: length of the format data that follows (positions 21 - 36). |
21 - 22 | 1 | Audio Format, 2-byte integer. 1 means uncompressed PCM. |
23 - 24 | 2 | Number of channels, 2-byte integer. Here 2. |
25 - 28 | 44100 | Sample Rate, 32-bit integer. Common values are 44100 (CD) and 48000 (DAT). The sample rate is the number of samples per second, in Hz. Here 0xAC44 = 44100. |
29 - 32 | 176400 | Byte Rate: bytes of audio data per second, (Sample Rate * BitsPerSample * Channels) / 8. Here 0x02B110 = 176400. |
33 - 34 | 4 | Bytes needed for one sample across all channels, (BitsPerSample * Channels) / 8. |
35 - 36 | 16 | Bits Per Sample: bit depth of a single sample, e.g. 8, 16 or 32. |
37 - 40 | "data" | "data" chunk header; marks the beginning of the data section. The bytes 0x64 61 74 61 spell "data". |
41 - 44 | Data size | SubChunk Size: size of the data section in bytes. Here 0x12d770 = 1,234,800. |
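If the fmt SubChunk Size equals 0x10 (16), the header carries no extra information and the WAV header is 44 bytes long. If it equals 0x12 (18), extra information is present and the header is longer than 44 bytes.
When the WAV header carries extra information, the fmt SubChunk Size is 18 and another sub-chunk holding custom additional information follows it; the "data" sub-chunk comes only after that.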
pcm size = (bytes per sample) * samples
= ((Sample Encoding) * Channels / 8 bits) * samples
= 16 * 2 / 8 * 308700
= 1,234,800 bytes
77176 0012d770 ff ff 00 00 01 00 02 00 01 00 01 00 ff ff 00 00 |................|
77177 0012d780 00 00 01 00 00 00 ff ff ff ff ff ff 02 00 02 00 |................|
77178 0012d790 01 00 ff ff ff ff 00 00 00 00 00 00 4c 49 53 54 |............LIST|
77179 0012d7a0 1a 00 00 00 49 4e 46 4f 49 53 46 54 0e 00 00 00 |....INFOISFT....|
77180 0012d7b0 4c 61 76 66 35 36 2e 34 30 2e 31 30 31 00 |Lavf56.40.101.|
77181 0012d7be
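Computing the file size:
77180 * 16 - 2 = 1,234,878 (77180 dump lines of 16 bytes each; the last line holds only 14 bytes, which matches the end offset 0x12d7be)
PCM audio data size = 1,234,878 - 44 (header) - 34 (trailer) = 1,234,800
Lavf56.40.101: this shows that the audio file was encoded with ffmpeg. lavf stands for libavformat, one of ffmpeg's components, and the digits that follow are its version number.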
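PCM (Pulse Code Modulation): a method for digitally representing sampled analog signals. It is the standard form of digital audio in computers, compact discs, digital telephony and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps.
channel: the number of channels; common layouts are mono, stereo and surround.
sample: one sampling instant. The "sample bit" usually means the number of bits taken by one sample on one channel (commonly 8, 16, 24 or 32 bits).
rate: the sample rate, i.e. the number of samples taken per second, counted in frames per second (Hz).
frame: one frame holds the sample bits of all channels for one sampling instant, i.e. frame = channels * (sample bit).
Interleaved: a way of laying out audio data. In interleaved mode the data is stored as consecutive frames: the left and right samples of frame 1 are recorded first (assuming stereo), then frame 2, and so on. In non-interleaved mode all left-channel samples of a period are recorded first, followed by the right-channel samples, so the data is stored channel by channel. Interleaved mode is used in most cases.
period: each time the hardware buffer has room for period size frames, the hardware raises an interrupt to tell the ALSA driver to write data to the hardware.
Period size: the number of frames handled per hardware interrupt. Reads and writes on an audio device are counted in frames.
buffer size: the size of the data buffer, made up of several periods. buffer size = period size * periods, where periods is the number of hardware interrupts needed to process one full buffer.
xrun: when a period elapses, the sound card raises an interrupt telling the ALSA driver to fill in or read out data. However, ALSA only reads and writes when the user calls writei or readi; it does not buffer data on its own. If the upper layer does not call writei or readi in time, the result is an overrun (during capture, the buffer is full and the data has not yet been read out) or an underrun (data is needed for playback but nothing has been written). Together they are called xrun.
softvol: an Advanced Linux Sound Architecture (ALSA) plugin that adds software-based volume control to the ALSA mixer (alsamixer). It is useful when the sound card has no hardware volume control. The softvol plugin is built into ALSA and needs no separate installation. Another use case is when the hardware volume control cannot amplify the sound beyond a certain threshold, which leaves audio files too quiet; a software amplifier can then be created to raise the volume at the cost of some quality.
UCM: the ALSA Use Case Manager describes how to set up the mixer for specific use cases (such as "play audio" or "phone call"). It also describes how to modify the mixer state to route audio to particular outputs and inputs, and how to control those devices.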
Frame calculation example:
Here is an alternative example for the above discussion.
Say we want to work with a stereo, 16-bit, 44.1 KHz stream, one-way (meaning, either in playback or in capture direction). Then we have:
'stereo' = number of channels: 2
1 analog sample is represented with 16 bits = 2 bytes
1 frame represents 1 analog sample from all channels; here we have 2 channels, and so:
1 frame = (num_channels) * (1 sample in bytes) = (2 channels) * (2 bytes (16 bits) per sample) = 4 bytes (32 bits)
To sustain 2x 44.1 KHz analog rate - the system must be capable of data transfer rate, in Bytes/sec:
Bps_rate = (num_channels) * (1 sample in bytes) * (analog_rate) = (1 frame) * (analog_rate) = ( 2 channels ) * (2 bytes/sample) * (44100 samples/sec) = 2*2*44100 = 176400 Bytes/sec
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>             /* usleep() */
#include <alsa/asoundlib.h>     /* standard install path of the alsa-lib header */

#define MESSAGE(format, ...) printf("[%s][%s][%d]: " format "\n", __FILE__, __FUNCTION__, __LINE__, ##__VA_ARGS__)

static snd_output_t *log;
static unsigned buffer_time = 0;
static unsigned period_time = 0;
static int start_delay = 0;
static int stop_delay = 0;
void dump_hw_params(snd_pcm_t *handle, snd_pcm_hw_params_t *params, snd_output_t *log)
{
fprintf(stderr, "Params of device \"%s\":\n",
snd_pcm_name(handle));
fprintf(stderr, "--------------------\n");
snd_pcm_hw_params_dump(params, log);
fprintf(stderr, "--------------------\n");
}
snd_pcm_t *device_create(void)
{
    int ret = -1;                           /* return value of the ALSA calls */
    int n;
    const char *hw_name = "default";        /* sound card device name */
    int direction = 0;
    unsigned int channel = 2;
    unsigned int sample_rate = 44100;       /* set_rate_near() takes an unsigned int pointer */
    snd_pcm_uframes_t chunk_size = 1024;
    snd_pcm_uframes_t buffer_size = 0;
    snd_pcm_t *handle;                      /* PCM device handle */
    snd_pcm_hw_params_t *hw_params = NULL;  /* hardware info and PCM stream configuration */
    snd_pcm_sw_params_t *swparams;
    snd_pcm_uframes_t start_threshold, stop_threshold;

    /* step 1: open the PCM device; 0 as the last argument selects the default (blocking) mode */
    ret = snd_pcm_open(&handle, hw_name, SND_PCM_STREAM_PLAYBACK, 0);
    if (ret < 0) {
        /* ALSA returns a negative error code instead of setting errno */
        fprintf(stderr, "snd_pcm_open: %s\n", snd_strerror(ret));
        return NULL;
    }
    MESSAGE();

    /* step 2: allocate the snd_pcm_hw_params_t structure */
    ret = snd_pcm_hw_params_malloc(&hw_params);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_malloc: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 3: initialise hw_params with the full configuration space of the device */
    ret = snd_pcm_hw_params_any(handle, hw_params);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_any: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 4: set the access type (interleaved, for snd_pcm_readi/snd_pcm_writei) */
    ret = snd_pcm_hw_params_set_access(handle, hw_params, SND_PCM_ACCESS_RW_INTERLEAVED);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_access: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 5: set the sample format to SND_PCM_FORMAT_S16_LE */
    ret = snd_pcm_hw_params_set_format(handle, hw_params, SND_PCM_FORMAT_S16_LE);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_format: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 6: set the sample rate; if the hardware does not support it, the nearest rate is used */
    ret = snd_pcm_hw_params_set_rate_near(handle, hw_params, &sample_rate, &direction);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_rate_near: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 7: set the number of channels */
    ret = snd_pcm_hw_params_set_channels(handle, hw_params, channel);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_channels: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* get the maximum buffer time (in microseconds) and cap it at 500 ms */
    ret = snd_pcm_hw_params_get_buffer_time_max(hw_params, &buffer_time, 0);
    MESSAGE("buffer_time:%u", buffer_time);
    if (buffer_time > 500000)
        buffer_time = 500000;

    /* calc period time: a quarter of the buffer time */
    if (buffer_time > 0)
        period_time = buffer_time / 4;
    MESSAGE("period time:%u", period_time);

    /* set period time */
    if (period_time > 0)
        ret = snd_pcm_hw_params_set_period_time_near(handle, hw_params, &period_time, 0);
    MESSAGE("period time:%u", period_time);
    MESSAGE("buffer time:%u", buffer_time);

    /* set buffer time */
    if (buffer_time > 0)
        ret = snd_pcm_hw_params_set_buffer_time_near(handle, hw_params, &buffer_time, 0);
    MESSAGE("buffer time:%u", buffer_time);

    /* step 8: apply the hw_params to the device */
    ret = snd_pcm_hw_params(handle, hw_params);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* for debug info */
    dump_hw_params(handle, hw_params, log);

#if 0
    /* soft params */
    snd_pcm_hw_params_get_period_size(hw_params, &chunk_size, 0);
    snd_pcm_hw_params_get_buffer_size(hw_params, &buffer_size);
    snd_pcm_sw_params_alloca(&swparams);
    snd_pcm_sw_params_current(handle, swparams);
    n = chunk_size;
    ret = snd_pcm_sw_params_set_avail_min(handle, swparams, n);
    n = buffer_size;
    start_threshold = n + (double) sample_rate * start_delay / 1000000;
    if (start_threshold < 1)
        start_threshold = 1;
    if (start_threshold > n)
        start_threshold = n;
    ret = snd_pcm_sw_params_set_start_threshold(handle, swparams, start_threshold);
    stop_threshold = buffer_size + (double) sample_rate * stop_delay / 1000000;
    ret = snd_pcm_sw_params_set_stop_threshold(handle, swparams, stop_threshold);
    ret = snd_pcm_sw_params(handle, swparams);
    /* for debug info */
    snd_pcm_sw_params_dump(swparams, log);
#endif

    /* the parameters are installed on the device; the container can be released */
    snd_pcm_hw_params_free(hw_params);
    return handle;

failed:
    if (hw_params)
        snd_pcm_hw_params_free(hw_params);
    snd_pcm_close(handle);
    return NULL;
}
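void device_play(snd_pcm_t *pcm_handle, FILE *fp)
{
    int ret = -1;
    int size = 5512;        /* bytes read from the file per iteration */
    char *buffer;
    int frame;

    buffer = (char *) malloc(size);
    if (buffer == NULL)
        return;
    MESSAGE("size=%d", size);

    /* note: reading starts at the beginning of the file, so the WAV header is sent to the device too */
    while (1)
    {
        ret = fread(buffer, 1, size, fp);
        if (ret == 0)
        {
            fprintf(stderr, "end of file on input\n");
            break;
        }
        /* 16-bit stereo: 4 bytes per frame; write only the frames actually read */
        frame = ret / 4;

        /* step 9: write the audio data to the PCM device */
        // MESSAGE("fread ret:%d", ret);
        /* the inner parentheses matter: without them ret receives the result of the '<' comparison */
        while ((ret = snd_pcm_writei(pcm_handle, buffer, frame)) < 0)
        {
            usleep(2000);
            if (ret == -EPIPE)
            {
                /* EPIPE means underrun */
                fprintf(stderr, "underrun occurred\n");
                /* re-prepare the device so that it can accept data again */
                snd_pcm_prepare(pcm_handle);
                MESSAGE();
            }
            else
            {
                fprintf(stderr, "error from writei: %s\n", snd_strerror(ret));
                break;
            }
        }
    }
    free(buffer);
    MESSAGE();
}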
void device_destroy(snd_pcm_t *pcm_handle)
{
    /* step 10: play out the pending frames, then close the PCM device handle */
    snd_pcm_drain(pcm_handle);
    snd_pcm_close(pcm_handle);
    MESSAGE();
}
int main(int argc, char *argv[])
{
    FILE *fp;
    snd_pcm_t *handle;      /* PCM device handle, declared in pcm.h */

    if (argc != 2) {
        printf("error: play [music name]\n");
        return -1;
    }
    fp = fopen(argv[1], "rb");
    if (fp == NULL)
        return -1;
    snd_output_stdio_attach(&log, stderr, 0);
    handle = device_create();
    if (handle == NULL) {
        /* device_create() has already reported the error */
        snd_output_close(log);
        fclose(fp);
        return -1;
    }
    device_play(handle, fp);
    device_destroy(handle);
    snd_output_close(log);
    fclose(fp);
    return 0;
}
WAV header format references
https://docs.fileformat.com/audio/wav/
https://juejin.cn/post/6844904051964903431