Bug #2824: New higher speed CRC code - DragonFlyBSD - DragonFlyBSD bugtracker

Bug #2824

closed

New higher speed CRC code

Added by robin.carey1 over 10 years ago. Updated over 10 years ago.

Status: Rejected
Priority: Normal
Assignee:
Category:
Target version:
Start date: 06/09/2015
Due date:
% Done:
Estimated time:

Description

Dear DragonFlyBSD bugs,

This isn't really a bug. I noticed there is the possibility of improving
the performance of the recently committed new CRC code ("fast iscsi crc
code").

In the following function:

sys/libkern/icrc32.c
<http://gitweb.dragonflybsd.org/dragonfly.git/blob/d557434b1f5510b6fed895379af444f0d034c07b:/sys/libkern/icrc32.c>

static uint32_t
singletable_crc32c(uint32_t crc, const void *buf, size_t size) {
        const uint8_t *p = buf;

        while (size--)
                crc = crc32Table[(crc ^ *p++) & 0xff] ^ (crc >> 8);

        return crc;
}

The two separate operations of "size--" and "*p++" could be combined into
one operation. The way that I would do that would be something like:

...
        size_t i;
        for (i = 0; i < size; ++i) {
                crc = crc32Table[(crc ^ p[i]) & 0xff] ^ (crc >> 8);
        }
...

That would save one operation per loop iteration, which should improve performance.

I haven't looked at the rest of the code, so perhaps there are other
performance improvements that could be had.

Hope this helps ...

--
Sincerely,

Robin Carey BSc


Updated by alexh over 10 years ago

It doesn't save any operation/instruction with an optimizing compiler.

Even though it should be obvious, here are the critical loops of both versions as real generated code (compiled with gcc -O3): the original first, then the proposed one. The only difference is a 1-byte saving on the encoding of the xor. There are no real savings, and really no point in "optimizing" like that. The compiler does a better job :)

  10:   48 83 c6 01             add    $0x1,%rsi
  14:   89 c1                   mov    %eax,%ecx
  16:   c1 e8 08                shr    $0x8,%eax
  19:   32 4e ff                xor    -0x1(%rsi),%cl
  1c:   0f b6 c9                movzbl %cl,%ecx
  1f:   33 04 8d 00 00 00 00    xor    0x0(,%rcx,4),%eax
  26:   48 39 d6                cmp    %rdx,%rsi
  29:   75 e5                   jne    10 <singletable_crc32c+0x10>

  40:   89 c1                   mov    %eax,%ecx
  42:   32 0e                   xor    (%rsi),%cl
  44:   48 83 c6 01             add    $0x1,%rsi
  48:   c1 e8 08                shr    $0x8,%eax
  4b:   0f b6 c9                movzbl %cl,%ecx
  4e:   33 04 8d 00 00 00 00    xor    0x0(,%rcx,4),%eax
  55:   48 39 d6                cmp    %rdx,%rsi
  58:   75 e6                   jne    40 <singletable_crc32c_carey+0x10>
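For anyone who wants to repeat the comparison, here is a minimal sketch that places both loop forms side by side in one translation unit. It is not the file used to produce the listings above. The name singletable_crc32c_carey is taken from the disassembly. The run-time table builder crc32c_init_table and the helper crc32c_variants_agree are hypothetical additions for illustration, and the table is assumed to be the standard reflected CRC-32C (Castagnoli) table rather than a copy of the one in icrc32.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: the table is built at run time from the reflected
 * Castagnoli polynomial 0x82F63B78. The kernel source uses a
 * precomputed crc32Table instead.
 */
static uint32_t crc32Table[256];

/* Hypothetical helper: fill crc32Table, one entry per byte value. */
static void
crc32c_init_table(void)
{
        for (uint32_t n = 0; n < 256; n++) {
                uint32_t c = n;

                for (int k = 0; k < 8; k++)
                        c = (c & 1) ? (c >> 1) ^ 0x82F63B78u : c >> 1;
                crc32Table[n] = c;
        }
}

/* Loop as quoted in the report: counter decrement plus pointer increment. */
static uint32_t
singletable_crc32c(uint32_t crc, const void *buf, size_t size)
{
        const uint8_t *p = buf;

        while (size--)
                crc = crc32Table[(crc ^ *p++) & 0xff] ^ (crc >> 8);

        return crc;
}

/* Loop as proposed in the report: a single index variable. */
static uint32_t
singletable_crc32c_carey(uint32_t crc, const void *buf, size_t size)
{
        const uint8_t *p = buf;
        size_t i;

        for (i = 0; i < size; ++i)
                crc = crc32Table[(crc ^ p[i]) & 0xff] ^ (crc >> 8);

        return crc;
}

/*
 * Hypothetical helper: returns 1 if both variants produce the same
 * value for every prefix length 0..size of buf, 0 otherwise.
 */
static int
crc32c_variants_agree(const void *buf, size_t size)
{
        assert(buf != NULL || size == 0);

        crc32c_init_table();
        for (size_t n = 0; n <= size; n++) {
                if (singletable_crc32c(~0u, buf, n) !=
                    singletable_crc32c_carey(~0u, buf, n))
                        return 0;
        }
        return 1;
}
```

Compiling a file like this with gcc -O3 -c and inspecting the object with objdump -d should make it possible to compare the two loops directly. The exact instructions will likely differ with compiler version and target.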


Updated by alexh over 10 years ago

Status changed from New to Rejected

