Events: Data- och informationsteknik events at Chalmers University of Technology Wed, 13 Jan 2021 16:36:47 +0100 Sadat Hoseini, Computer Science and Engineering<p>Zoom</p><p>Inference of Effective Pairwise Relations for Data Processing</p><div><br /></div> <div>In various data science and artificial intelligence areas, representation learning is a performance-critical step. While different representation learning methods can detect different descriptive and latent features, many of them build on pairwise relations. The thesis consists of two parts, studying pairwise relations from two points of view: i) pairwise relations between the states of a Markov chain, and ii) pairwise relations between objects in a dataset based on a desired (dis)similarity measure. In the first part of the thesis, we consider Markov chains, noting that pairwise relations between their states are naturally modeled by the state-transition matrix. We propose a method for modeling the performance of a synchronization method for a multi-processor architecture. Our model introduces and builds upon a cache line bouncing process that models the interaction of threads accessing the shared cache lines. <br /></div> <div>In the second part of the thesis, we consider representation learning using the transitive-aware Minimax distance, which enables the extraction of elongated manifolds and structures in the data. While recent work has made Minimax distances computationally feasible, little attention has been paid to their memory footprint, which is naturally O(N^2), the cost of storing all pairwise distances. We instead compute a novel hierarchical representation of the data from which pairwise Minimax distances can be efficiently inferred, requiring O(N) memory in total at the price of a higher computational cost. 
</div> <div>An alternative sampling-based approach is also derived, which computes approximate Minimax distances, also in O(N) memory but with a significantly reduced computational cost, while still yielding a good approximation, as verified by impressive results on clustering benchmarks. </div> <div>Finally, we develop an unsupervised learning framework for clustering vehicle trajectories based on Minimax distances. The framework is validated on real-world datasets collected from real driving scenarios, on which it demonstrates satisfactory performance.</div> Angerd, Computer Science and Engineering<p>Zoom</p><p>Approximation and Compression Techniques to Enhance Performance of Graphics Processing Units</p><span><br /><div>A key challenge in modern computing systems is to access data fast enough to fully utilize the computing elements in the chip. In Graphics Processing Units (GPUs), the performance is often constrained by register file size, memory bandwidth, and the capacity of the main memory. One important technique for alleviating this challenge is data compression. By reducing the amount of data that needs to be communicated or stored, memory resources crucial for performance can be efficiently utilized.</div> <br /> This thesis provides a set of approximation and compression techniques for GPUs, with the goal of efficiently utilizing the computational fabric and thereby increasing performance. The thesis shows that these techniques can substantially lower the amount of information the system has to process, and are thus important tools for meeting challenges in memory utilization.<br /><br /> This thesis makes contributions within three areas: controlled floating-point precision reduction, lossless and lossy memory compression, and distributed training of neural networks. 
In the first area, the thesis shows that through automated and controlled floating-point approximation, the register file can be more efficiently utilized. This is achieved through a framework which establishes a cross-layer connection between the application and the microarchitecture layer, and a novel register file organization capable of leveraging low-precision floating-point values and narrow integers for increased capacity and performance.<br /><br /> Within the area of compression, this thesis aims at increasing the effective bandwidth of GPUs by presenting a lossless and lossy memory compression algorithm to reduce the amount of transferred data. In contrast to state-of-the-art compression techniques such as Base-Delta-Immediate and Bitplane Compression, which use intra-block bases for compression, the proposed algorithm leverages multiple global base values to reach a higher compression ratio. The algorithm includes an optional approximation step for floating-point values, which offers a higher compression ratio at a given, low error rate.<br /><br /> Finally, within the area of distributed training of neural networks, this thesis proposes a subgraph approximation scheme for graph data which mitigates accuracy loss in a distributed setting. The scheme allows neural network models that use graphs as inputs to converge at single-machine accuracy, while minimizing synchronization overhead between the machines.<span style="display:inline-block"></span></span> Bäckström, Computer Science and Engineering<p>Zoom</p><p>Adaptiveness and Lock-free Synchronization in Parallel Stochastic Gradient Descent</p><br /><div>The emergence of big data in recent years due to the vast societal digitalization and large-scale sensor deployment has entailed significant interest in machine learning methods to enable automatic data analytics. 
Stochastic gradient descent (SGD), a first-order iterative optimization procedure, is the backbone of a majority of the learning algorithms used in industrial as well as academic settings. However, SGD is often time-consuming, as it typically requires several passes through the entire dataset in order to converge to a solution of sufficient quality. In order to cope with increasing data volumes, and to facilitate accelerated processing utilizing contemporary hardware, various parallel SGD variants have been proposed. In addition to traditional synchronous parallelization schemes, asynchronous ones have received particular interest in recent literature because they scale better, requiring less coordination and consequently less waiting time. However, asynchrony implies inherent challenges in understanding the execution of the algorithm and its convergence properties, due to the presence of both stale and inconsistent views of the shared state.</div> <br /><div>In this work, we aim to increase the understanding of the convergence properties of SGD for practical applications under asynchronous parallelism and develop tools and frameworks that facilitate improved convergence properties as well as further research and development. First, we focus on understanding the impact of staleness, and introduce models for capturing the dynamics of parallel execution of SGD. This enables (i) quantifying the statistical penalty on the convergence due to staleness and (ii) deriving an adaptation scheme, introducing a staleness-adaptive SGD variant, MindTheStep-AsyncSGD, which provably reduces this penalty. Second, we aim at exploring the impact of synchronization mechanisms, in particular consistency-preserving ones, and the overall effect on the convergence properties. 
To this end, we propose Leashed-SGD, an extensible algorithmic framework supporting various synchronization mechanisms for different degrees of consistency, enabling in particular a lock-free and consistency-preserving implementation. In addition, the algorithmic construction of Leashed-SGD enables dynamic memory allocation, claiming memory only when necessary, which reduces the overall memory footprint. We perform an extensive empirical study, benchmarking the proposed methods together with established baselines, focusing on the prominent application of Deep Learning for image classification on the benchmark datasets MNIST and CIFAR, showing significant improvements in convergence time for Leashed-SGD and MindTheStep-AsyncSGD.</div> Talks with Kerrie Mengersen<p>Zoom</p><p></p><span style="font-family:&quot;open sans&quot;"></span><span style="font-family:&quot;open sans&quot;;background-color:initial">February 3, 2021, 10:</span><span style="font-family:&quot;open sans&quot;;background-color:initial">00-11:00 pm (Swedish time)</span><div><div><div style="font-family:&quot;open sans&quot;;font-size:14px">Online, Zoom</div> <div style="font-family:&quot;open sans&quot;;font-size:14px"><br /></div> <div style="font-family:&quot;open sans&quot;;font-size:14px"><b>Zoom: </b><a href="">Sign up for the AI Talks mailing list</a></div> <div style="font-family:&quot;open sans&quot;;font-size:14px"><b>YouTube:</b> <a href=""></a> </div> <div style="font-family:&quot;open sans&quot;;font-size:14px"> </div> <div><span style="font-family:&quot;open sans&quot;;font-size:16px;font-weight:600;background-color:initial">Title: </span><font face="open sans"><span style="font-size:16px"><b><span style="background-color:initial"></span><span style="background-color:initial">Calling all Citizen Scientists: using Bayesian Statistics to advance public input into scientific analysis</span></b></span></font></div> <div style="font-family:&quot;open sans&quot;;font-size:14px"> </div></div> 
<span style="font-family:&quot;open sans&quot;"></span><div><div><div style="font-family:&quot;open sans&quot;"></div> <div style="font-family:&quot;open sans&quot;;font-size:16px"></div> <p class="chalmersElement-P"><span style="font-family:&quot;open sans&quot;"><span style="font-weight:700">Abstract:</span></span><span style="font-family:&quot;open sans&quot;"> </span><span></span><span style="background-color:initial"><font face="open sans">Many scientific projects are benefitting strongly from the contribution of information by community members. For example, citizen scientists might classify arrays of ecological images or express risks about medical or social outcomes under hypothetical scenarios. However, these data can pose challenges for the scientist and analyst. Three such challenges are how to elicit the required information when this is in the form of expressed opinions, how to combine these opinions, and how to evaluate the credibility of crowd-sourced data. </font></span></p> <p class="chalmersElement-P"><span style="background-color:initial"><font face="open sans">In this presentation, I will describe the efforts of members of our research team to address these challenges, and set this work in the context of a number of case studies that motivate the research. In particular, I will focus on the use of Bayesian models to frame the problem, incorporate the various sources of information and express the desired probabilistic outcomes. These approaches include Bayesian Networks, spatial measurement-error and item-response models, and meta-analysis. 
</font></span></p> <p class="chalmersElement-P"><span style="background-color:initial"><font face="open sans">This research has been undertaken in collaboration with a range of colleagues who will be acknowledged in the presentation.</font></span></p> <p class="chalmersElement-P"> </p></div></div> <div><span style="font-family:&quot;open sans&quot;;font-size:16px"></span><div style="font-family:&quot;open sans&quot;"><span style="font-family:&quot;open sans&quot;, sans-serif;font-size:20px">About the speaker</span><br /></div> <div style="font-family:&quot;open sans&quot;"><span style="font-family:&quot;open sans&quot;, sans-serif;font-size:20px"><br /></span> </div> <div><font face="open sans"><img class="chalmersPosition-FloatLeft" src="/SiteCollectionImages/Centrum/CHAIR/events/Kerrie%20Mengersen.jpg" alt="" style="margin:5px 10px;width:180px;height:253px" />Kerrie Mengersen is a Distinguished Professor in Statistics at the Queensland University of Technology in Brisbane, Australia. She is the Deputy Director of the Australian Research Council Centre of Excellence in Mathematical Frontiers and the Director of the QUT Centre for Data Science. </font><font face="open sans"><br /></font></div> <div> </div> <div><font face="open sans">Distinguished Professor </font><font face="open sans">Mengersen is acknowledged to be one of the leading researchers in her discipline. </font><font face="open sans">Her research interests are in mathematical statistics and its application to substantive challenges in health, environment and industry, with particular focus on Bayesian methods. 
Professor Mengersen is also an elected Fellow of the Australian Academy of Science and the Australian Academy of Social Sciences, and a member of the Statistical Society of Australia and the IMS, ASA, RSS, ISBA and ISI.</font><span></span><br /></div> <div> </div> <div style="font-family:&quot;open sans&quot;"><a href=""></a></div> <div><br /></div> <span style="font-family:&quot;open sans&quot;;font-size:16px"></span></div></div> Ethics online: Refracting social norms through AI<p>Zoom</p><p>AI Ethics seminar with Ericka Johnson, professor of gender and society, Linköping University.</p><h2 class="chalmersElement-H2">Refracting social norms through AI</h2> <div>When new technologies are introduced to a social context, they can be used as metaphorical lenses to help us see and become aware of norms and values that otherwise go unarticulated and unnoticed. AI works this way, too. This seminar starts with a discussion of how technologies can be used to make social norms visible and then moves to some specific examples taken from AI to ask what values (held by which actors) are becoming enrolled and enacted in the introduction and development of the technology. </div> <div><br /></div> <div><a href="">Link to registration</a>.</div> <div>A Zoom link will be sent out to registered participants.</div> <div><br /></div> <div>You can find previous AI Ethics seminars on YouTube.</div> <div><a href=""></a></div>'21 1st Workshop on AI Engineering<p>Virtual</p><p>During the 43rd International Conference on Software Engineering (ICSE21) the 1st Workshop on AI Engineering – Software Engineering for AI – will be arranged.</p><br /><div>In the development and implementation of AI-based systems, the main challenge is not to develop the best models/algorithms, but to provide support for the entire lifecycle – from a business idea, through collection and management of data, software development managing both data and code, product deployment and operation, and to its evolution. 
There is a clear need for specific support of Software Engineering for AI. </div> <br /><div>The aim of the workshop is to bring together researchers and practitioners in software engineering, data science and AI, and to build up a community that will target the new challenges emerging in Software Engineering that AI/data-science engineers and software engineers are facing in the development of AI-based systems. The workshop will be highly interactive: in addition to the invited keynotes and short paper presentations, there will be several discussion sessions. We plan to combine local and remote participation.</div> <div><br /></div> <div><a href="">Go to the website</a> for information about the call for submissions and to find out more about topics of interest. The submission deadline is January 12, 2021. <br /></div> <div><br /></div>