Support more searchable file types in Elasticsearch, having a community manager - This week in Orchard (15/09/2023)

Gábor Domonkos
Announcement, Media Library, This week in Orchard

Add support for multiple media files in Elasticsearch and support more searchable file types in Elasticsearch! Let's see the details in our post and jump into a discussion about having a community manager!

Orchard Core updates

Add support for multiple media files in Elasticsearch

We support searching PDF files in Elasticsearch. We do this by reading the text of the PDF file found in a Media Field and saving that text into a custom index in Elasticsearch. This is great, because any time a user searches for text found in the PDF file, we return the content items the PDF belongs to. The problem is that we only store the text of the first attached PDF file, not the text of all attached files.

The solution is that when indexing a Text/String-based field, we add all the values to the same index key. This allows the content of multiple files to be indexed as expected.
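To make the idea concrete, here is a minimal Python sketch of collecting every extracted text under one index key instead of keeping only the first. It is an illustration only, not Orchard Core's actual C# implementation; the function name `build_index_entries` and the key `Article.Attachments.FileText` are hypothetical.

```python
from collections import defaultdict


def build_index_entries(field_values):
    """Group (key, text) pairs so that every value under a key is kept."""
    entries = defaultdict(list)
    for key, text in field_values:
        if text:  # skip files where no text could be extracted
            entries[key].append(text)
    return dict(entries)


# Hypothetical index key and extracted texts for two attached files.
extracted = [
    ("Article.Attachments.FileText", "text of first.pdf"),
    ("Article.Attachments.FileText", "text of second.pdf"),
]
index_doc = build_index_entries(extracted)
```

With this shape, a search that matches the text of either file finds the same content item, which is the behavior the fix is after.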


Support more searchable file types in Elasticsearch

And while we are talking about improving Elasticsearch, here comes another goodie, again by Mike Alhayek! Until this improvement, we only supported searching in PDF files in Elasticsearch, but we should also be able to search .doc, .docx, .txt, .rtf, .ppt, and .pptx files. We can easily use the Open XML SDK to add this support. The Open XML SDK provides tools for working with Office Word, Excel, and PowerPoint documents.

If you navigate to Configuration -> Features, you will find a new feature called Media Indexing, which extends the media indexing capability to also cover searching within files with the following extensions: .txt, .md, .docx, and .pptx.

If you check the files under the Indexing folder of the OrchardCore.Media library, you will find three new ones called PresentationDocumentMediaFileTextProvider, TextMediaFileTextProvider, and WordDocumentMediaFileTextProvider, which allow you to index the content of the mentioned file types.

The new WordDocumentMediaFileTextProvider allows searching in Word documents
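For a rough feel of what extracting text from a Word document involves, here is a small Python sketch that uses only the standard library. It is not the Open XML SDK and not Orchard Core's provider; it relies on the fact that a .docx file is a ZIP package whose main text lives in `word/document.xml`. The helper names `extract_docx_text` and `make_sample_docx` are hypothetical, and the sample package is a stripped-down stand-in, not a file Word would open.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_text(stream):
    """Return the text of a .docx stream, one paragraph per line."""
    with zipfile.ZipFile(stream) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more <w:t> elements.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def make_sample_docx(paragraphs):
    """Build a minimal in-memory package for demonstration purposes only."""
    body = "".join(f"<w:p><w:r><w:t>{p}</w:t></w:r></w:p>" for p in paragraphs)
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


sample = make_sample_docx(["Hello Orchard", "Searchable text"])
text = extract_docx_text(sample)
```

The extracted string is what would then be handed to the search index, so a query for a phrase inside the document can match the content item the file is attached to.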

News from the community

Having a community manager

The community had a discussion last week about having a community manager. There is currently nobody in the Orchard world who deals with marketing and design for Orchard the way a marketing manager or product owner would for a paid product. Orchard Core is not a paid product, but for open-source projects and communities, a community manager is the person who does such things: getting the word out and evangelizing Orchard, not just by word of mouth but in a systematic way. This person would handle marketing in general, shepherd the project so that it follows the requirements and needs of the community, respond to what happens in the world, and do the things that developers don't really like to do. Let's have this discussion in detail! Do you agree? Do you know people you think would be suitable? What do you think about this? Chime in on the following discussion and tell us your opinion!

Having a community manager

Orchard Dojo Newsletter

Lombiq's Orchard Dojo Newsletter has 521 subscribers! We started this newsletter to keep the Orchard community informed of the latest news about the platform. By subscribing, you will get an e-mail whenever a new post is published on Orchard Dojo, including This week in Orchard, of course.

Do you know of other Orchard enthusiasts who would like to read our weekly articles? Tell them to subscribe here!

If you are interested in more news about Orchard and the details of the topics above, don't forget to check out the recording of this Orchard meeting!
