快速排序压缩算法是利用排序算法进行数据压缩的算法,与哈夫曼压缩相比有更快的压缩和解压速度(约为2倍),单次压缩率比哈夫曼压缩仅小1%可重复压缩2-3次。
其分为频度表和压缩数据两部份。频度表由字符和权重组成。压缩数据是排序中比较产生的数据。
第一部份统计字符种类和频率。利用变长码压缩。先输出原数据长度,字符种类数,再输出由变长码压缩的频率。
第二部份输出是快速排序算法产生的数据。算法使用改进的从左到右将数据分为小于基准值和大于基准值两组分组策略快速排序算法。小于基准值顺序写入原数据左边,大于基准值经附加数组写入原数据右边,并输出与基准值比较值(小于输出0,大于输出1)。在这之前先输出比较前数据是否有序的逻辑值(有序输出0返回,不再进行比较,未有序输出1,下一步进行比较。)
然后将第二部份0和1组成的数据压缩为字节数据。再将两部份数据合并形成完整数据。到此压缩结束。
解压先提取数据长度及字符频度表。并分配一个与数据长度相等的数组,该数组写入心0到数据长度的位置值。根报压缩数据比较值重新按排序算法走一遍,在数据交换的时候,交换前数组的位置值。
,然后定义一个数据数组根据频度表字符和频按顺序输出字符到数组。根据位置数组的位置值从0到数据长度输出数据数组相应位置的字符,到此解压结束。
原始的数据:
3,6,1,8,3,8,2,0,7,8,4,8,5,4,4,9,6,8,7,4,4,4,2,6,10,0,4,3,5,1,5,5,8,7,0,9,8,0,5,3,2,5,9,7,0,1,2,7,7,1,6,5,2,3,4,10,0,9,5,10,10,10,2,4,5,1,5,5,5,10,6,2,10,9,6,0,10,7,9,3,7,9,0,10,10,6,7,10,4,2,1,3,0,0,10,1,3,5,8,1,
还原的数据:
3,6,1,8,3,8,2,0,7,8,4,8,5,4,4,9,6,8,7,4,4,4,2,6,10,0,4,3,5,1,5,5,8,7,0,9,8,0,5,3,2,5,9,7,0,1,2,7,7,1,6,5,2,3,4,10,0,9,5,10,10,10,2,4,5,1,5,5,5,10,6,2,10,9,6,0,10,7,9,3,7,9,0,10,10,6,7,10,4,2,1,3,0,0,10,1,3,5,8,1,
数据序号 :
8,26,35,38,45,57,76,83,93,94,3,30,46,50,66,91,96,100,7,23,41,47,53,63,72,90,1,5,28,40,54,80,92,97,11,14,15,20,21,22,27,55,64,89,13,29,31,32,39,42,52,59,65,67,68,69,98,2,17,24,51,71,75,86,9,19,34,44,48,49,78,81,87,4,6,10,12,18,33,37,99,16,36,43,58,74,79,82,25,56,60,61,62,70,73,77,84,85,88,95,
排序的数据:
0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,10,10,10,10,10,10,10,10,10,10,10,10,
压缩的数据:
字节:21
{1,1,1,1,0,0,0,1,1,0,0,0,1,1,1,0,0,0,1,0,0}
字节:352 {0,1,0,1,0,1,0,0,1,1,0,1,0,0,0,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,1,1,0,1,1,0,0,0,0,0,1,1,0,0,0,1,1,0,1,0,0,0,0,1,0,1,0,1,1,1,0,0,0,0,0,0,0,1,1,0,1,1,1,0,1,1,1,0,1,1,0,1,1,1,1,1,0,0,0,0,0,0,1,0,0,0,1,0,1,0,1,0,0,1,1,1,1,1,1,1,0,0,1,1,1,0,1,1,0,0,1,1,0,1,0,0,0,0,1,0,1,1,0,1,0,1,1,0,1,1,1,0,0,1,0,1,0,0,1,0,0,0,1,1,0,0,1,0,1,0,0,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,0,0,0,1,0,0,1,0,0,0,1,1,0,1,0,0,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,1,1,1,1,0,1,1,0,0,1,0,1,1,1,1,0,0,0,0,1,0,0,1,1,1,1,1,1,1,0,0,0,1,1,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,1,0,1,0,0,0,0,1,1,1,1,1,1,0,1,1,0,1,0,1,0,1,1,1,0,0,1,1,0,0,1,1,0,1,1,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0,1,1,1,1,0,0,0,1,1,0,1,0,1,0,0,1,0,1,1,1,1,1,0,1,0,0,1,1,1,1}
伪代码:
分组排序版压缩 (l, i - 1, 压缩字节, 压缩字节1)
{
key = l
.判断循环首 (key < r 且 mb [key] ≤ mb [key + 1])
key = key + 1
.判断循环尾 ()
.如果 (key = r)
写出字节集 (压缩字节1, { 0 })
返回 ()
.否则
.如果结束
写出字节集 (压缩字节1, { 1 })
分组压缩 (mb, mm, l, r, i, j, 压缩字节)
分组排序版压缩 (l, i - 1, 压缩字节, 压缩字节1)
分组排序版压缩 (j + 1, r, 压缩字节, 压缩字节1)
}
//-----------------------------
取基准值 (left, right, xmin, xmax)
{
.如果 (l < r)
.变量循环首 (l, r, 1, i)
字符表 [mb [i] + 1].权重 = 字符表 [mb [i] + 1].权重 + 1
.变量循环尾 ()
s = 0
.计次循环首 (256, i)
.如果真 (字符表 [i].权重 > 0)
s = s + 1
权重表 [s] = 合并长整数 (字符表 [i].权重, i)
.如果真结束
.计次循环尾 ()
重定义数组 (权重表, 真, s)
数组排序 (权重表, )
xmin = 长整数取高位 (权重表 [1]) - 1
xmax = 长整数取高位 (权重表 [s]) - 1
fl = 0
mks = r - l + 1
zi = 0
i = l
fl0 = 0
.变量循环首 (1, s, 1, i)
fl = fl + 长整数取低位 (权重表 [i])
mk = mks ÷ 2
.如果真 (fl ≥ mk 且 zi = 0)
.如果 (取绝对值 (fl - mk) ÷ 2 ≤ 取绝对值 (mk - fl0))
zi = i
.否则
zi = i - 1
.如果结束
' zi = i
跳出循环 ()
.如果真结束
fl0 = fl
.变量循环尾 ()
.如果真结束
.' 如果真 (zi = 0)
' zi = 1
.如果真结束
返回 (长整数取高位 (权重表 [zi]) - 1)
.否则
返回 ( xmax)
}
//---------------------------------------------------------------------------------
分组 (mb, mm, l, r, i, j, 压缩字节)
{
key = 取基准值 (left, right, xmin, xmax)
i = left
j = right
s = left
m = left
.判断循环首 (s ≤ right)
.判断开始 (a [s] ≤ key) ' 小数左到右写入原数组
写出字节集 (压缩字节, { 0 })
a [i] = a [s]
i0 = i
jp = 0
i = i + 1
.判断 (a [s] > key) ' 大数右到左写入辅助数组
写出字节集 (压缩字节, { 1 })
b [j] = a [s]
ip = 0
j = j - 1
.默认
.判断结束
s = s + 1
.判断循环尾 ()
' 按先后顺序从j+1开始左到右将大数写入原数组
s = right
p = j + 1
.判断循环首 (s > j)
a [p] = b [s]
s = s - 1
p = p + 1
.判断循环尾 ()
}
//----------------------------------------------------------------------------------
分组排序版解压 (l, i - 1, 压缩字节, 压缩字节1)
{
.如果 (l < r)
' 调试输出 (l, r)
读入数据 (压缩字节1, c)
.如果 (c = 0)
返回 ()
.否则
.如果结束
分组解压 (mm, bb, l, r, i, j, 压缩字节)
分组排序版解压 (l, i - 1, 压缩字节, 压缩字节1)
分组排序版解压 (j + 1, r, 压缩字节, 压缩字节1)
}
//--------------------------------------------------
分组解压 (mm, bb, l, r, i, j, 压缩字节)
{
i = left
j = right
s = left
m = left
.判断循环首 (s ≤ right)
读入数据 (压缩字节, c)
.判断开始 (c = 0) ' 小数左到右写入原数组
a [i] = a [s]
i = i + 1
.判断 (c = 1) ' 大数右到左写入辅助数组
b [j] = a [s]
j = j - 1
.判断结束
s = s + 1
.判断循环尾 ()
' 按先后顺序从j+1开始左到右将大数写入原数组
s = right
p = j + 1
.判断循环首 (s > j)
a [p] = b [s]
s = s - 1
p = p + 1
.判断循环尾 ()
}
//----------------------------------------------------
.计次循环首 (数组数量, i)
mm [i] = i
解压输出[i]=0
.计次循环尾 ()
移到文件首 (压缩字节)
移到文件首 (压缩字节1)
分组排序版解压 (1, 数组数量, 压缩字节, 压缩字节1)
s = 1
.计次循环首 (取数组成员数 (字符表), i)
c = 字符表 [i].字符
.计次循环首 (字符表 [i].权重, n)
mb [s] = c
s = s + 1
.计次循环尾 ()
.计次循环尾 ()
.计次循环首 (数组数量, k)
解压输出[mm [k]] = mb [k]
.计次循环尾 ()