Michael's Notes & Blog

2021-02-02

Basic Union Find in Rust

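// Free-function form; inside an `impl` block, call these as `Self::find`.

// Returns the root of `i`, compressing the path along the way.
fn find(g: &mut Vec<usize>, i: usize) -> usize {
    if g[i] != i {
        g[i] = find(g, g[i]);
        g[i]
    } else {
        i
    }
}

// Merges the sets containing `i` and `j`; `s` holds the size of each root's set,
// and the smaller set is attached to the larger one (union by size).
fn union_fn(g: &mut Vec<usize>, s: &mut Vec<usize>, i: usize, j: usize) {
    let i = find(g, i);
    let j = find(g, j);
    if i != j {
        if s[j] > s[i] {
            g[i] = j;
            s[j] += s[i];
        } else {
            g[j] = i;
            s[i] += s[j];
        }
    }
}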
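As a usage sketch, the same find/union pair can count connected components in an undirected graph. The `count_components` helper and its edge-list input are hypothetical, added only for illustration; the parent array starts with every node as its own root and every set at size 1.

```rust
/// Returns the root of `i`, compressing the path along the way.
fn find(parent: &mut [usize], i: usize) -> usize {
    if parent[i] != i {
        let p = parent[i];
        let root = find(parent, p);
        parent[i] = root;
    }
    parent[i]
}

/// Merges the sets containing `i` and `j`, attaching the smaller set to the larger.
fn union(parent: &mut [usize], size: &mut [usize], i: usize, j: usize) {
    let (i, j) = (find(parent, i), find(parent, j));
    if i == j {
        return;
    }
    if size[j] > size[i] {
        parent[i] = j;
        size[j] += size[i];
    } else {
        parent[j] = i;
        size[i] += size[j];
    }
}

/// Hypothetical helper: number of connected components among nodes `0..n`.
fn count_components(n: usize, edges: &[(usize, usize)]) -> usize {
    let mut parent: Vec<usize> = (0..n).collect(); // each node is its own root
    let mut size = vec![1; n]; // each set starts with one element
    for &(a, b) in edges {
        union(&mut parent, &mut size, a, b);
    }
    // A node is a root exactly when it is its own representative.
    (0..n).filter(|&i| find(&mut parent, i) == i).count()
}
```

Calling `count_components(5, &[(0, 1), (1, 2), (3, 4)])` groups the nodes into {0, 1, 2} and {3, 4}.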
2021-02-02

Get Data From AWS IoT Device Shadow and Store In Dynamodb

We need to implement an IoT platform on AWS, and we will be using device shadows to communicate between devices and other parts such as apps. One requirement for the platform is that we need to access historic data. My plan is that I'm going to create a Rule that reads from device shadows and store d…

2021-01-12

Rust Async Notes

Due to the nature of notes, the content of these notes is unstructured. `.await`ing on a Multithreaded Executor: a Future may move between threads, so any variables used in async bodies must be able to travel between threads. So it is not safe to use Rc, RefCell or any other types that don't impleme…
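The "must be able to travel between threads" requirement is the `Send` bound. The sketch below is an analogy rather than async code: the standard library ships no executor, so it uses `std::thread::spawn`, which places the same `Send` requirement on what it runs. The `assert_send` and `shared_count` helpers are hypothetical names used only here.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Compile-time check: only accepts values whose type is `Send`.
fn assert_send<T: Send>(_: &T) {}

/// Increments a shared counter once from each of `workers` threads
/// and returns the final value.
fn shared_count(workers: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    assert_send(&counter); // Arc<Mutex<usize>> can cross threads

    // The line below would fail to compile, because Rc is not Send:
    // assert_send(&std::rc::Rc::new(0usize));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            // The closure is moved to another thread, so everything
            // it captures must be Send.
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

Swapping `Arc<Mutex<_>>` for `Rc<RefCell<_>>` makes the `thread::spawn` call a compile error, which is the same kind of rejection a multithreaded executor's spawn function gives for a future holding such a value across an `.await`.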

2020-12-30

Atomic field updaters

Field updaters are essentially used as "wrappers" around a volatile field (primitive or object reference). They are generally used when one or both of the following are true: You generally want to refer to the variable "normally" (that is, without having to always refer to it vi…

2020-12-30

ReentrantLock in Java

The advantages of ReentrantLock over an intrinsic lock: The thread waiting to acquire a lock can be interrupted if lockInterruptibly() is used. A timeout can be set by using tryLock(). Hand-over-hand locking is possible, which enables efficient concurrent access to data structures like linked lists. Conditio…

2020-12-29T10:00:00

Tips on Using Locks

The content of this note is from the book "Seven Concurrency Models in Seven Weeks" Do Not Use an Object's Hash to Order Locks One piece of advice you'll often see is to use an object's hash code to order lock acquisition, such as shown here: if (System.identityHashCode(left) < System.…

2020-12-29

Correct Way to Do Double Locking

class Foo {
    private volatile Helper helper = null;

    public Helper getHelper() {
        Helper result = helper;
        if (result == null) {
            synchronized(this) {
                result = helper;
                if (result == null) {
                    helper = result = new Help…

2020-12-29

Java Memory Model

How do final fields work under the new JMM? Assuming the object is constructed "correctly", once an object is constructed, the values assigned to the final fields in the constructor will be visible to all other threads without synchronization. In addition, the visible values for any other …

2020-12-29

Safe construction techniques

Don't publish the "this" reference during construction

public class EventListener {
    public EventListener(EventSource eventSource) {
        // do our initialization
        ...
        // register ourselves with the event source
        eventSource.registerListener(this);
    }
}

Eve…

2020-08-10T14:53:00

High Performance Python

Benchmarking & Profiling Test & Benchmarking time command timeit module in Python pytest-benchmark timeit as a decorator timeit as a context manager Find Bottlenecks cProfile line_profiler memory_profiler Threads Synchronization Synchronization primitives join Lo…

2020-08-09T20:31:00

Threads in Python

Types of OS threads POSIX Windows user threads models 1 - 1 M - 1 M - N States New Runnable Running Not runnable Dead Create threads threading.Thread Inherit Thread class fork APIs setDaemon(True) threading.current_thread() threading.main_thread() threading.enume…

2020-08-09

Partitioning

[Markmap mind map: Partitioning]

2020-07-29T16:15:05

Ordering Guarantees

Ordering and Causality: The causal order is not a total order. Linearizability is stronger than causal consistency. Causal consistency is the strongest possible consistency model that does not slow down due to network delays, and remains available in the face of network failures. Sequence Number Ord…

2020-07-29T15:33:05

Linearizability

What is linearizability? The basic idea is to make a system appear as if there were only one copy of the data, and all operations on it are atomic. What makes a system linearizable? A read operation that completes before a write operation begins must return the old value. Read operation begins after a write op…

2020-07-09T17:24:05

Leader Based Replication

In Leader Based Replication, one node is designated as the leader, which receives write requests from clients, writes the new data to its local storage and also sends the data to all of its followers. Synchronous Versus Asynchronous Replication Synchronous Advantage: The follower is guaranteed to h…

2020-07-09T16:55:05

Multi-Leader Replication

A natural extension of the leader-based replication model is to allow more than one node to accept writes, which requires Multi-Leader Replication. Within each datacenter, regular leader-follower replication is used; between datacenters, each datacenter's leader replicates its changes to the leaders in ot…

2020-07-03T20:22:00

Column-Oriented Storage

What is Column-Oriented Storage? In contrast to Row-Oriented Storage, each column of data is stored in a separate file, and all the column files store rows in the same order. Column Compression: Values in a column often look quite repetitive, so they can be compressed. Bitmap Encoding: we can take a c…
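A minimal sketch of bitmap encoding, assuming a column of small integer IDs (the `u32` values and both function names are hypothetical): each distinct value gets one bitmap with one bit per row, and a query of the form "value is any of these" becomes a bitwise OR over the matching bitmaps. Real column stores would pack the bits and usually run-length encode them; `Vec<bool>` is used here only for readability.

```rust
use std::collections::BTreeMap;

/// Builds one bitmap per distinct value; bit `r` is set when row `r` holds that value.
fn bitmap_encode(column: &[u32]) -> BTreeMap<u32, Vec<bool>> {
    let mut bitmaps = BTreeMap::new();
    for (row, &value) in column.iter().enumerate() {
        bitmaps
            .entry(value)
            .or_insert_with(|| vec![false; column.len()])[row] = true;
    }
    bitmaps
}

/// Rows whose value is any of `values`: the OR of the corresponding bitmaps.
fn rows_matching_any(
    bitmaps: &BTreeMap<u32, Vec<bool>>,
    values: &[u32],
    n_rows: usize,
) -> Vec<bool> {
    let mut result = vec![false; n_rows];
    for value in values {
        if let Some(bits) = bitmaps.get(value) {
            for (row, &bit) in bits.iter().enumerate() {
                result[row] |= bit;
            }
        }
    }
    result
}
```

Because every column file keeps rows in the same order, bit `r` in any bitmap refers to the same row `r` in every other column.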

2020-07-02T17:54:00

SSTables and LSM-Trees

SSTable stands for Sorted String Table, in which the sequence of key-value pairs is sorted by key. A compaction process runs in the background to merge SSTables and remove old values of keys, using a mergesort-style merge. The in-memory index is sparse, keys between two consecutive ind…
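The compaction merge can be sketched as the merge step of mergesort over two sorted segments. This is a simplified illustration, assuming each segment is an in-memory slice with unique keys and that the second argument is the more recent segment; `merge_segments` is a hypothetical name. When both segments contain a key, only the newer value is kept.

```rust
use std::cmp::Ordering;

/// Merges two key-sorted segments into one, keeping the newer value for duplicate keys.
fn merge_segments<'a>(
    older: &[(&'a str, &'a str)],
    newer: &[(&'a str, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    let mut out = Vec::with_capacity(older.len() + newer.len());
    let (mut i, mut j) = (0, 0);
    while i < older.len() && j < newer.len() {
        match older[i].0.cmp(newer[j].0) {
            Ordering::Less => {
                out.push(older[i]);
                i += 1;
            }
            Ordering::Greater => {
                out.push(newer[j]);
                j += 1;
            }
            Ordering::Equal => {
                // Same key in both segments: drop the old value.
                out.push(newer[j]);
                i += 1;
                j += 1;
            }
        }
    }
    // At most one of these tails is non-empty.
    out.extend_from_slice(&older[i..]);
    out.extend_from_slice(&newer[j..]);
    out
}
```

Since both inputs are already sorted, the merge reads each segment once from start to end, which is why it works on files larger than memory.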

2019-12-10T12:07:00

Manage Home Directory

We share a GPU box and we use Anaconda to manage our Python environments these days. Sooner or later we'll encounter the issue that the home directory of the box is full. In this case, we can use the following procedure to solve the problem: cd ~ # Find out which file or directory uses the most s…

2019-12-09T16:33:05

OCR Summary

Service Providers & Libraries Service Providers Alibaba (Table Recognition & Recognition based on custom templates) Google (Text Detection && Document Text Detection, table recognition in alpha phase) Amazon: (Textract, with table recognition capability) [ABBYY]: Ocrolus Open Sourc…

2019-12-05

Table Detection

I am currently working on table detection and I will use this post to keep a memo of things I learned in the process. System Requirement: I finally installed detectron successfully, but I think that using a prebuilt Docker image is the easiest way to get started; a Dockerfile is located in the docker fo…