Abstract: The Science of Science field advances the measurement, evaluation, and prediction of scientific outcomes through the study of extensive scholarly data. Bibliometrics is well suited to these purposes: it studies large volumes of scientific data using mathematical and statistical methods and is widely used to assess the impact of papers and authors within a specific field or community. However, conducting bibliometric analyses poses several methodological, technical, and informational challenges (e.g., collecting and cleaning data, calculating indicators) that need to be addressed. This thesis aims to tackle some of these challenges and shed light on the factors influencing scientific impact, specifically focusing on open access publishing, international mobility, and the factors that influence the h-index. It also makes methodological contributions, such as author disambiguation and co-authorship network analysis, which provide insights into the methodological and informational challenges within bibliometric analysis. Another methodological challenge addressed in this research is the inference of gender for a large number of authors to obtain gender-related insights. By employing gender inference techniques, the research explores gender as an influential factor in scientific impact, shedding light on potential gender inequalities within the scholarly community. The research employs a bibliometric approach and mainly uses Scopus, a comprehensive dataset encompassing various disciplines, to make the following contributions:
• We explore the impact of publishing behavior, particularly the adoption of open access practices, on knowledge dissemination and scholarly communication. To this end, we investigate the impact of journals flipping from closed access to open access publishing models. We analyze changes in publication volumes and citation impact, demonstrating an overall increase in publication output and improved citation metrics following the transition to open access. However, the magnitude of these changes varies across scientific disciplines. In another study, we use a dataset of articles published by Springer Nature and employ correlation and regression analyses to examine the relationship between authors’ country affiliations, publishing models, and citation impact. Using a machine learning approach, we estimate the publishing model of papers based on different factors. The findings reveal different patterns in authors’ choices of publishing models based on income levels, availability of Article Processing Charges waivers, and journal rank. The study highlights potential inequalities in access to open access publishing and its citation advantage.
• We investigate the association between scholars’ mobility patterns, socio-demographic characteristics, and their scientific activity and impact. Using network and regression analyses, along with various statistical techniques, we study the international mobility of researchers. Furthermore, we conduct a comparative analysis of scientific outcomes, considering factors such as publications, citations, and measures of co-authorship network centrality. The findings reveal gender inequalities in mobility across scientific fields and countries and positive correlations between mobility and scientific success.
• Another of our studies employs machine learning techniques to predict scholars’ h-index as a metric of scientific impact. We examine author, co-authorship, paper, and venue-specific characteristics, in addition to prior impact-based features. The results emphasize the significance of non-prior impact-based features, particularly for early-career scholars in the long term, while also revealing the limited influence of gender on h-index prediction.
The findings of this research hold implications for researchers, academic institutions, and policymakers aiming to advance scientific knowledge and foster equitable practices. By uncovering the influential factors that shape scientific impact and addressing potential gender disparities, this research contributes to the broader objective of promoting diversity, inclusivity, and excellence within the scholarly community.