Open
Bug 462796
Opened 16 years ago
Updated 2 years ago
Add ARM optimizations to image decoders
Categories
(Core :: Graphics: ImageLib, defect)
Tracking
()
NEW
People
(Reporter: pavlov, Unassigned)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 1 obsolete file)
3.83 KB, patch
20.38 KB, patch
It would be great if we could do ARM-optimized JPEG and PNG decoding, similar to the MMX and SSE2 code we have for them now.
Comment 1•16 years ago
Android has ARM optimizations for libjpeg, but they appear to be Apache-licensed. We should talk to Google about getting them to relicense under the libjpeg license.
Comment 2•15 years ago
Stuart et al., we're seeing a pretty significant perf improvement with libjpeg-turbo on x86 (bug 573948): it looks like the hot path is running about 2.5x faster on all platforms. I imagine you'll see something similar with a good NEON library.
Comment 3•13 years ago
libpng recently added some ARM assembler code.
Latest libpng 1.5.9 just landed in Firefox, so it's a good time to try it.
To test it, you would need to define PNG_ARM_NEON in mozpngconf.h
and include the arm/filter_neon.S file from the libpng sources.
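For illustration only, the define could be guarded so that it applies just to NEON-capable builds. This is a sketch, not the actual mozpngconf.h change: only PNG_ARM_NEON comes from the comment above, while the `__ARM_NEON__` guard (a macro that GCC and Clang predefine when NEON is enabled) is an assumption added here.

```c
/* mozpngconf.h (sketch, assuming a GCC/Clang ARM toolchain) */
#if defined(__ARM_NEON__)
/* Ask libpng to use its ARM NEON row-filter code. The build must also
 * assemble arm/filter_neon.S from the libpng sources for this to link. */
#  define PNG_ARM_NEON
#endif
```

On non-ARM builds the guard leaves the macro undefined, so the existing code paths would be unaffected.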
Comment 4•12 years ago
Hi all,
You can try the fast NEON patches for external/libpng and external/zlib, provided by Code Aurora Forum. You can find the code in the FFOS unagi device. These patches can improve PNG decode performance by 15%.
Comment 5•12 years ago
(In reply to james.zhang from comment #4)
> Hi all,
>
> You can try the fast NEON patches for external/libpng and external/zlib,
> provided by Code Aurora Forum. You can find the code in the FFOS unagi
> device. These patches can improve PNG decode performance by 15%.
Can you provide a link to these patches?
Comment 6•12 years ago
> Can you provide a link to these patches?
I think he's referring to code which lives in the B2G tree under the "external" directory.
The git repositories are
git://codeaurora.org/platform/external/zlib and
git://codeaurora.org/platform/external/libpng
I don't know if there are web interfaces to those repositories. They both appear to have a bit of NEON code in them (git grep neon).
Comment 7•12 years ago
libpng and zlib neon patch
Comment 8•12 years ago
add png_read_filter_row_neon.S
Comment 9•12 years ago
add inflate_fast_copy_neon.S
Comment 10•12 years ago
Please refer to the attachment; we have verified the patch on Android.
Comment 11•12 years ago
Glenn, DRC:
Have you seen these patches before (from CodeAurora)? Would you like to roll them upstream?
Comment 12•12 years ago
Oh, sorry - I thought there was a libjpeg part of this, but it's libpng only.
Comment 13•12 years ago
(In reply to Joe Drew (:JOEDREW! \o/) from comment #11)
> Glenn, DRC:
>
> Have you seen these patches before (from CodeAurora)? Would you like to roll
> them upstream?
See comment #3. This implementation looks to be equivalent to the one in libpng's arm directory, but not copied from it or based upon it. I can't tell by looking which is better, although this one is much more thoroughly commented.
Comment 14•12 years ago
The unagi device also has a libjpeg NEON patch. I'll provide the patch on B2G later.
git://codeaurora.org/platform/external/jpeg
Comment 15•12 years ago
(In reply to james.zhang from comment #14)
> The unagi device also has a libjpeg NEON patch. I'll provide the patch on
> B2G later.
> git://codeaurora.org/platform/external/jpeg
Note that we use libjpeg-turbo, which has extensive NEON optimizations.
I'd be surprised if the CA code is faster than libjpeg-turbo, although if it is, I imagine DRC would be interested.
Comment 16•12 years ago
(In reply to Justin Lebar [:jlebar] from comment #15)
> (In reply to james.zhang from comment #14)
> > The unagi device also has a libjpeg NEON patch. I'll provide the patch on
> > B2G later.
> > git://codeaurora.org/platform/external/jpeg
>
> Note that we use libjpeg-turbo, which has extensive NEON optimizations.
>
> I'd be surprised if the CA code is faster than libjpeg-turbo, although if it
> is, I imagine DRC would be interested.
I think the CA code optimizes different functions than libjpeg-turbo does, so the two could both take effect.
Comment 17•12 years ago
(In reply to Justin Lebar [:jlebar] from comment #15)
> (In reply to james.zhang from comment #14)
> > The unagi device also has a libjpeg NEON patch. I'll provide the patch on
> > B2G later.
> > git://codeaurora.org/platform/external/jpeg
>
> Note that we use libjpeg-turbo, which has extensive NEON optimizations.
>
> I'd be surprised if the CA code is faster than libjpeg-turbo, although if it
> is, I imagine DRC would be interested.
Sorry, libjpeg-turbo and the CA code optimize the same functions. I'll compare their performance and choose the better one.
Comment 18•12 years ago
James, do you have any performance numbers to report?
Comment 19•12 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #18)
> James, do you have any performance numbers to report?
About a 15% performance improvement in PNG decoding.
Comment 20•12 years ago
Is the 15% perf. improvement taken from Comment 4 or is it based on a new benchmark you did because of Jeff's question in Comment 18?
Comment 21•12 years ago
(In reply to Jorge Quiñónez from comment #20)
> Is the 15% perf. improvement taken from Comment 4 or is it based on a new
> benchmark you did because of Jeff's question in Comment 18?
It is taken from Comment 4. We verified this patch with the Antutu benchmark on Android and by testing decode of large PNG images. We ran these benchmarks last year.
Comment 22•12 years ago
It would be good to know how much of the performance improvement was due to the libpng patch and how much was due to the zlib patch. In the usual case where PNG scanline filters are all NONE, the libpng patch would provide no improvement.
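To illustrate the point about filter type NONE, here is a minimal scalar sketch of PNG row un-filtering in C. It is not the libpng or Code Aurora code: the helper `unfilter_row` and its return value (a count of bytes that needed arithmetic) are hypothetical, and only filter types None, Sub and Up are handled. The filter type numbers follow the PNG specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Filter type bytes as defined by the PNG specification. */
enum { FILTER_NONE = 0, FILTER_SUB = 1, FILTER_UP = 2 };

/*
 * Reverse the filter on one scanline in place (hypothetical helper).
 *   row  : filtered bytes of the current scanline, without the type byte
 *   prev : reconstructed previous scanline, or NULL for the first row
 *   len  : number of bytes in row
 *   bpp  : bytes per complete pixel, at least 1
 * Returns how many bytes needed arithmetic, i.e. the work that a
 * NEON-optimized filter routine could speed up.
 */
size_t unfilter_row(uint8_t filter, uint8_t *row, const uint8_t *prev,
                    size_t len, size_t bpp)
{
    size_t i;

    assert(bpp > 0);
    switch (filter) {
    case FILTER_NONE:
        /* Bytes are already final: nothing for a filter routine to do. */
        return 0;
    case FILTER_SUB:
        /* Add the byte one pixel to the left (mod 256). */
        for (i = bpp; i < len; i++)
            row[i] = (uint8_t)(row[i] + row[i - bpp]);
        return len > bpp ? len - bpp : 0;
    case FILTER_UP:
        /* Add the byte directly above; the first row has nothing above. */
        if (prev == NULL)
            return 0;
        for (i = 0; i < len; i++)
            row[i] = (uint8_t)(row[i] + prev[i]);
        return len;
    default:
        /* Average (3) and Paeth (4) are omitted from this sketch. */
        return 0;
    }
}
```

For a row with filter type None the function returns immediately, so an optimized filter routine has nothing to accelerate on such rows, and any measured gain for those images would have to come from elsewhere, such as the zlib inflate path.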
Comment 23•11 years ago
Comment on attachment 690722
png_read_filter_row_neon.S
This patch was made obsolete by checkin of libpng-1.5.17, bug #886499.
Attachment #690722 - Attachment is obsolete: true
Updated•11 years ago
Hardware: x86 → ARM
Updated•2 years ago
Severity: normal → S3