Loss of token/document tensor at least with PDFMiner

Hello,

Thank you for this useful library!

# The issue
I ran into a problem with the following code:

``` python
import spacy
from spacypdfreader import pdf_reader

nlp = spacy.load("fr_core_news_sm")
doc = pdf_reader('9.PADD_SCOT RM.pdf', nlp)
doc.tensor
```

I get an empty tensor.

Whereas:
``` python
import spacy
from pdfminer import high_level

nlp = spacy.load("fr_dep_news_trf")
path = '9.PADD_SCOT RM.pdf'  # assumed: the same PDF as in the first snippet
doc = nlp(high_level.extract_text(path))
doc.tensor
```
Returns the expected tensor.

# Reason

The issue seems to come from the fact that `pdf_reader` processes each page as a separate document and then merges the pages with [Doc.from_docs](https://spacy.io/api/doc#from_docs). It turns out that `Doc.from_docs` does not preserve `Doc.tensor` (although I could not find this documented).
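A possible workaround, as a minimal sketch only: merge the per-page docs and, if the merged doc's tensor does not have one row per token, rebuild it by stacking the per-page tensors. The helper name `merge_pages_keep_tensor` is hypothetical and is not part of spacypdfreader or spaCy. The sketch assumes CPU (NumPy) tensors and that every page was processed by the same pipeline, so all tensors share the same width. The example builds the docs by hand with dummy tensors so it runs without a trained model.

``` python
import numpy as np
from spacy.tokens import Doc
from spacy.vocab import Vocab


def merge_pages_keep_tensor(page_docs):
    """Hypothetical helper: merge per-page Docs and keep a per-token tensor."""
    merged = Doc.from_docs(page_docs)
    if merged is None:  # from_docs returns None for an empty list
        return None
    # Rebuild the tensor only if the merged doc lacks one row per token
    if merged.tensor.ndim != 2 or merged.tensor.shape[0] != len(merged):
        pairs = [(d, np.asarray(d.tensor)) for d in page_docs if len(d)]
        # Only stack when every page really carries one row per token
        if pairs and all(t.ndim == 2 and t.shape[0] == len(d) for d, t in pairs):
            merged.tensor = np.vstack([t for _, t in pairs])
    return merged


# Two hand-built "pages" sharing one vocab, with dummy 4-dimensional tensors
vocab = Vocab()
page1 = Doc(vocab, words=["Bonjour", "le", "monde"])
page2 = Doc(vocab, words=["Deuxième", "page"])
page1.tensor = np.ones((3, 4), dtype="float32")
page2.tensor = np.zeros((2, 4), dtype="float32")

merged = merge_pages_keep_tensor([page1, page2])
print(len(merged), merged.tensor.shape)
```

The check on the tensor shape keeps the helper harmless if the installed spaCy version's `Doc.from_docs` already carries the tensor over, since the tensor is then left untouched.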


