Scientific queries are usually domain-specific and resource needs usually come with multiple constraints.
I finished my secondment at the Universitat Politècnica de València (UPV) on 31 December. Being able to do the secondment on-site was very important, since seeing people in person and working closely with them every day makes the connection real. Discussions are much easier and more efficient, and a lot of valuable information and findings came from informal communication.
During the secondment, I focused mainly on notebook search, that is, retrieving notebooks from a database given natural language queries. The members of CVBLab are themselves end-users of a notebook search system. To understand researchers' intentions when searching for research resources, we collected 37 natural language queries that express resource needs within CVBLab. Examples are “segmentation of epidermis in histopathological images”, “whole-slide images classification tensorflow code”, “Deep Kernel Learning” and “mitosis detection in histological images”.
By analyzing these queries (extracting the task, method, model and data type involved in each query, and talking to the experts who provided them), we made the following observations:
- Scientific queries are usually domain-specific, which means the terms used are specialized and thus not covered by common knowledge.
- Resource needs usually come with multiple constraints. Taking “segmentation of epidermis in histopathological images” as an example, the desired notebooks must target the epidermis segmentation task and the data should be histopathological images.
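To make the second observation concrete, here is a minimal sketch of a query treated as a set of constraints that a notebook must satisfy all at once. Everything in it is an assumption for illustration: the `Notebook` and `QueryConstraints` classes, the `task` and `data_type` fields, and the three sample notebooks are made up, and exact string matching stands in for whatever matching a real notebook search system would use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Notebook:
    # Hypothetical metadata fields; a real index would have to extract these.
    title: str
    task: str
    data_type: str


@dataclass
class QueryConstraints:
    # None means the query does not constrain this facet.
    task: Optional[str] = None
    data_type: Optional[str] = None


def satisfies(notebook: Notebook, query: QueryConstraints) -> bool:
    """Return True only if the notebook meets every constraint the query sets."""
    pairs = [(query.task, notebook.task), (query.data_type, notebook.data_type)]
    return all(wanted is None or wanted == actual for wanted, actual in pairs)


def search(notebooks: List[Notebook], query: QueryConstraints) -> List[Notebook]:
    """Keep the notebooks that satisfy all constraints of the query."""
    return [nb for nb in notebooks if satisfies(nb, query)]


# Made-up notebooks, for illustration only.
notebooks = [
    Notebook("A", "epidermis segmentation", "histopathological images"),
    Notebook("B", "epidermis segmentation", "dermoscopic images"),
    Notebook("C", "mitosis detection", "histopathological images"),
]

# "segmentation of epidermis in histopathological images" as two constraints.
query = QueryConstraints(
    task="epidermis segmentation",
    data_type="histopathological images",
)
results = search(notebooks, query)
print([nb.title for nb in results])
```

In this toy setting only notebook A is returned: B matches the task but not the data type, and C matches the data type but not the task.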
I am sure that the observations obtained from talking with the AI researchers at CVBLab, together with the collected queries, will greatly help in designing an efficient notebook search system.
Figure 1. Exciting meet-up with other PhD students at UPV
Na Li – ESR1