Resolved Best Way To Display Pdf On Screen?

Dalski (Active member; joined Jun 3, 2020; messages: 40; programming experience: Beginner)
Shockingly, there is nowhere near as much info out there on this as you'd think there would be. I need to display a PDF as an image on screen. Hopefully most of the time the PDFs will be clean vectors; however, sometimes they will be rasterized.

PDFsharp seems like the open-source solution. For my purposes I'll be treating the PDFs as images. I do not want the Adobe preview window with tools. I just want to display the page as an image, really, with single-pixel-thick lines.

The PDFsharp documentation (Main Page - PDFsharp and MigraDoc Wiki) gives nowhere near enough info.

Another solution, ten years old, is "Using Adobe Reader in a WPF app", but I think I'd rather use PDFsharp.

Does anyone know what my best route is here? I cannot believe there are no decent YouTube tutorials/articles on this fundamental topic. Perhaps because it is so vital, developers have not written decent articles on it?
 
When you have bitmaps you can "stitch" them yourself, either onto another bitmap or directly on screen in the Paint event. Graphics.DrawImage is used in both cases.
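A minimal sketch of the stitching idea, assuming the pages have already been rendered to System.Drawing bitmaps by whichever PDF library is used. The class and method names (PageStitcher, StitchVertically) are made up for illustration; only Bitmap, Graphics.FromImage and Graphics.DrawImage are real API.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

public static class PageStitcher
{
    // Draws the given page bitmaps one below the other onto a new bitmap.
    // The caller owns (and should dispose) both the inputs and the result.
    public static Bitmap StitchVertically(IReadOnlyList<Bitmap> pages)
    {
        if (pages == null || pages.Count == 0)
            throw new ArgumentException("At least one page is required.", nameof(pages));

        int width = pages.Max(p => p.Width);
        int height = pages.Sum(p => p.Height);

        var result = new Bitmap(width, height);
        using (Graphics g = Graphics.FromImage(result))
        {
            g.Clear(Color.White);
            int y = 0;
            foreach (Bitmap page in pages)
            {
                // Width and height are passed explicitly so each page is drawn
                // at its pixel size; the (image, x, y) overload takes the
                // image's DPI into account instead.
                g.DrawImage(page, 0, y, page.Width, page.Height);
                y += page.Height;
            }
        }
        return result;
    }
}
```

Drawing directly on screen would look much the same, except that the Graphics object comes from the Paint event's PaintEventArgs rather than from Graphics.FromImage.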
 
Thanks John. What about vector PDFs? When I take measurements, it is really noticeable if the PDF is rasterized. It must be the same Graphics.DrawImage, I think. I'm just a bit concerned because you specifically say bitmap, so I'm being over-cautious.
 
With the libraries you get images (bitmaps) from pdf pages. When you have bitmaps you can do regular graphics operations with them.
 
Quote: "The reason for this is that we receive blueprints of housing developments spread over several pages"
I would expect this to be sent as a geographical surveyor's data file rather than as PDFs made from that original surveyor's data. Strange.
 
Quote: "I would expect this to be sent as a geographical surveyor's data file rather than as PDFs made from that original surveyor's data. Strange."

Exactly right, Skydiver. The issue we face is that the vast majority of clients will not send this data. I've had conversations with clients who refused to send CAD drawings because they thought any changes made would affect their design & wreck the whole project LOL LOL. That's what we're up against!
They're also reluctant to send raw data because they are then liable for any inaccuracies in the model. So they would rather the client assume all the risk by needlessly regenerating data from flat 2D PDFs. Insane, isn't it! This will not be the case for much longer; the younger bucks will move into senior positions & things will change, but it's the norm at the moment.


Quote: "With the libraries you get images (bitmaps) from pdf pages. When you have bitmaps you can do regular graphics operations with them."
I have used software in the past which took in a PDF and could scale, move & rotate it, whilst putting the text on different layers & transforming that along with the drawing. It did not rasterize the text; it left it as text which could be edited & easily selected. It was quite brilliant! If the PDF were converted to a bitmap, wouldn't the text then be converted to raster as well?
I've also used other software which displays PDFs with single-pixel lines, which was quite brilliant. I'd love to create my own software with the pros of the different software I have used in the past, with none of their negatives.
 
So basically, you want vector graphics for the drawing lines, and text blocks for the text. How will you determine if a text block is text that happens to be close to a graphics entity vs. text which is actually a call-out associated with a graphics entity? It's been decades since I've read the PDF specs -- before Adobe made it open. I don't recall a way to make that distinction back then. Maybe it is available now?

Anyway, the first question would be if the PDFs contain bitmaps, or vector graphics. If bitmaps, then you'll need to do some image processing (using OpenCV perhaps?) to get the vector lines out of the bitmaps. Obviously, if the PDF contains vectors already, you'll want to use those. Then you'll need to pull out the text as well. Some OCR (using Tesseract perhaps?) for bitmaps, or just get the text blocks if vectors.

Once you have those, as you said, it's just linear algebra to do scaling, rotations, and other transforms. The lines will be rendered however you want to render them: 1-pixel-wide lines, or fancy scrollwork if you wish.
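To make the "just linear algebra" part concrete, here is a small sketch that assumes the line endpoints have already been extracted from the PDF as 2D points. It uses System.Numerics (Matrix3x2, Vector2), which is my choice for illustration, not something from this thread; the DrawingTransform class and its methods are made-up names.

```csharp
using System;
using System.Numerics;

public static class DrawingTransform
{
    // Builds one affine matrix: scale first, then rotate, then translate.
    // System.Numerics uses row vectors, so "A * B" applies A before B.
    public static Matrix3x2 Build(float scale, float degrees, Vector2 offset)
    {
        float radians = (float)(degrees * Math.PI / 180.0);
        return Matrix3x2.CreateScale(scale)
             * Matrix3x2.CreateRotation(radians)
             * Matrix3x2.CreateTranslation(offset);
    }

    // Applies the matrix to every point, returning a new array.
    public static Vector2[] Apply(Matrix3x2 m, Vector2[] points)
    {
        var result = new Vector2[points.Length];
        for (int i = 0; i < points.Length; i++)
            result[i] = Vector2.Transform(points[i], m);
        return result;
    }
}
```

Because only the endpoints are transformed, the line width is decided at draw time, which is what keeps a line one pixel wide at any zoom level.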

RE: single pixel lines. You probably know this, but most modern renderers do not draw true single-pixel lines. To make the lines look better on the display, especially when the lines are diagonal or curved, they use anti-aliasing, which may use 2-5 pixels of varying shades to draw a single edge that crosses from one raster row/column to the next.
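A quick way to see the difference, sketched with System.Drawing: draw the same diagonal 1-pixel pen line with anti-aliasing off and on, then count how many distinct colours end up in the bitmap. LineRendering and CountShades are made-up names; SmoothingMode, Pen and DrawLine are the real GDI+ API.

```csharp
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Drawing2D;

public static class LineRendering
{
    // Draws one diagonal 1-pixel-wide black line on a white background and
    // returns how many distinct colours the resulting bitmap contains.
    public static int CountShades(SmoothingMode mode)
    {
        using (var bmp = new Bitmap(64, 64))
        {
            using (Graphics g = Graphics.FromImage(bmp))
            using (var pen = new Pen(Color.Black, 1f))
            {
                g.Clear(Color.White);
                g.SmoothingMode = mode;
                g.DrawLine(pen, 2, 5, 60, 40);
            }

            // Collect every colour value that appears in the image.
            var shades = new HashSet<int>();
            for (int y = 0; y < bmp.Height; y++)
                for (int x = 0; x < bmp.Width; x++)
                    shades.Add(bmp.GetPixel(x, y).ToArgb());
            return shades.Count;
        }
    }
}
```

With SmoothingMode.None I would expect just the two colours (black and white), and noticeably more with SmoothingMode.AntiAlias because of the intermediate greys.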
 
To display a PDF you need either some kind of viewer control or a library that converts the pages to images. If you need to manipulate the PDFs, you need a different library.
 
Quote: "So basically, you want vector graphics for the drawing lines, and text blocks for the text. How will you determine if a text block is text that happens to be close to a graphics entity vs. text which is actually a call-out associated with a graphics entity? It's been decades since I've read the PDF specs -- before Adobe made it open. I don't recall a way to make that distinction back then. Maybe it is available now?"

I've never displayed a PDF before, but the text, like you say, is a different entity/element. I've only browsed the Adobe spec manual, but text is most definitely distinguishable from graphical elements. Not sure what you mean by call-out; maybe you're referring to a comment. Nonetheless, I'm not too concerned with this, as the text will be used for batch processing, for which I can easily program a flag/notification that vital text is missing.

John, the only manipulation needed is to scale and transform. I would think most of the libraries have this built in; I'm pretty sure PDFsharp does.
 
Yes, there are several PDF editing libraries; iText 7 (formerly iTextSharp) is another one.
 
Quote: "Not sure what you mean by call-out; maybe you're referring to a comment."
Yes. But often it'll be text in a box, and that box has a line that points to or is attached to a graphic element.
 
I've been researching this further. I wasn't very clear in my opening post, & apologies for the ambiguity; the main reason is that I am still learning. "Using Adobe Reader in a WPF app" is a great article, & as I want to display a PDF without losing the text, & have access to the layers, this would be great.

The article says that a Windows Forms control is necessary to display the PDF in WPF, because the ActiveX control needs the WindowsFormsHost element to work. My question is, as the article is 8 years old: is it still necessary to insert a Windows Forms control? Could one not use a WPF user control, or some more modern way, to completely eradicate Windows Forms & have the app be entirely of WPF construction?
 
Yes, it is still needed. WPF does not have native ActiveX support, but WinForms does. Since WPF can host a WinForms control, that is how you can get an ActiveX control inside WPF.

If you don't want to do this, then you will have to provide all the infrastructure that is needed to support an ActiveX control inside WPF. It is really easy if the ActiveX control does not have to interact with the outer window (only 3 or 4 interfaces to implement, if I recall correctly), but it quickly gets more difficult when you need clipboard and menu interactions.
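For reference, the hosting route (a WinForms control inside WPF) boils down to something like the sketch below. WinFormsHosting, Wrap and AddTo are made-up helper names; WindowsFormsHost and its Child property are the real WPF interop API. The child passed in is assumed to be the WinForms wrapper generated for the ActiveX control, which is not shown here; any System.Windows.Forms.Control is hosted the same way.

```csharp
using System.Windows.Controls;
using System.Windows.Forms.Integration; // WindowsFormsHost (WindowsFormsIntegration assembly)

public static class WinFormsHosting
{
    // Wraps a WinForms control so it can be placed in a WPF layout.
    // Must be called on the WPF UI (STA) thread.
    public static WindowsFormsHost Wrap(System.Windows.Forms.Control child)
    {
        return new WindowsFormsHost { Child = child };
    }

    // Adds the hosted control to a WPF panel such as a Grid.
    public static void AddTo(Panel wpfPanel, System.Windows.Forms.Control child)
    {
        wpfPanel.Children.Add(Wrap(child));
    }
}
```

The same thing can be declared in XAML with a WindowsFormsHost element; the code form is shown only because it makes the WPF-to-WinForms-to-ActiveX layering explicit.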
 