Blog

  • vitepress


    🔥 Edit on VS Code   ⚡️ Edit on StackBlitz

    Two kinds of mirror supported by Gitee are currently configured:

    • Push: automatically mirrors the Gitee repository to GitHub
    • Pull: mirrors the GitHub repository to Gitee

    Code synced through mirroring does not count toward the contribution graph of the mirrored repository.

    Since Gitee no longer offers the mirror-sync feature for free, a GitHub Actions workflow (/script/sync-gitee.yml) is used instead: pushing code to GitHub automatically syncs it to the Gitee mirror repository.

    Tip: the deployment path of Gitee Pages is all lowercase, while the URL generated by GitHub Pages follows the repository name and is case-sensitive.

    Continuous Integration

    On GitHub, the GitHub Actions CI service is used.

    On Gitee: Gitee Go is Gitee's newly launched CI/CD tool, but I run a local script instead.

    Gitee Go is a paid add-on, billed as prepaid build minutes; even the paid enterprise plans do not include value-added services such as Gitee Go. 😰

    Deploy automatically after pushing to the master branch:

    name: Deploy
    on:
      push:
        branches:
          - master

    The Algolia crawler runs every Friday at 03:00:

    name: Algolia
    on:
      schedule:
        - cron:  '0 3 * * 5'

    The Algolia free plan has limits, so the crawler cannot run on every push; otherwise:

    Github Action Error: Crawling issue: nbHits 0 for XXX

    Cause: You have exceeded your Free app’s 10,000 Record limit. You can delete records or indices, or upgrade at any time for increased capacity.

    Tip: during periods of high load on GitHub Actions, schedule events may be delayed. High-load times include the start of every hour. To reduce the chance of delay, schedule your workflow to run at a different time.

    Other users report delays of tens of minutes or more than an hour; in extreme cases the run may not execute at all.

    So the cron time set in a schedule is only the time at which the workflow is queued, not the exact time it runs. Note also that all the times above are in UTC, not Beijing time.

    To convert to Beijing time, add eight hours to the cron time. For example, 0 1 * * * triggers at 1:00 AM UTC, which is 9:00 AM Beijing time.
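
    To double-check such a conversion, here is a tiny Python sketch (assuming Python 3.9+ for the standard zoneinfo module); the date used is arbitrary.

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo  # standard library since Python 3.9

    # A workflow scheduled with cron "0 1 * * *" is queued at 01:00 UTC.
    utc_fire_time = datetime(2024, 1, 1, 1, 0, tzinfo=timezone.utc)

    # Convert to Beijing time (Asia/Shanghai, UTC+8).
    beijing_time = utc_fire_time.astimezone(ZoneInfo("Asia/Shanghai"))
    print(beijing_time.strftime("%H:%M"))  # prints 09:00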

    • StackBlitz operates on GitHub directly, which triggers the repository mirroring and then syncs the changes to Gitee

    The “Open in StackBlitz” button

    One of the ways to make your code example stand out in your docs or your repository’s readme file is to use our CTA (call-to-action) buttons.

    Open in StackBlitz

    To allow third-party cookies for all StackBlitz projects, go to your browser's cookie preferences and add exceptions for the following URL patterns:

    https://[*.]stackblitz.io
    https://[*.]local.webcontainer.io
    https://[*.]local-credentialless.webcontainer.io
    https://[*.]local-corp.webcontainer.io
    

    Use Codespaces to build and run in the browser:

    "dev:codespace": "npm run dev -- --host 0.0.0.0"

    Dependabot version updates are available free of charge for all repositories on GitHub.com.

    version: 2
    updates:
      - package-ecosystem: "npm" # See documentation for possible values
        directory: "/" # Location of package manifests
        schedule:
          interval: "monthly"
        commit-message:
          # Prefix all commit messages with "npm"
          prefix: "npm level up"
    Visit original content creator repository https://github.com/NidhoggDJoking/vitepress
  • csv_to_qlab

    CSV to QLAB

    To run on Mac:

    • download csv_to_qlab.dmg from the latest release
    • unzip the folder
    • open the app
      • QLab must be open on the receiving computer in order for the messages to be received.

    Please note that I do not currently have an Apple Developer Certificate and therefore there will be some scary warnings when trying to run this application locally. It is entirely up to you to decide to run this application. If you have concerns with the bundled application releases, I suggest cloning or forking the repository.

    How to format your csv file:

    Some columns are required, some are optional.

    Required columns

    • Number
    • Type
    • Name

    Number Type Name
    12 start Cue 12 GO

    Optional Columns

    • Notes
    • Follow
      • 0 – No Follow
      • 1 – Auto-Continue
      • 2 – Auto-Follow
    • Color (Options)
    • Target
    • File Target
    • Columns available for “midi” cue type:
      • MIDI Q Number
      • MIDI Device ID
      • MIDI Message Type
        • 1 – MIDI Voice Message (“Musical MIDI”)
        • 2 – MIDI Show Control Message (MSC)
        • 3 – MIDI SysEx Message
      • MIDI Control Number
      • MIDI Control Value
      • MIDI Patch Channel
      • MIDI Patch Number
      • MIDI Q List
      • MIDI Command Format (Options)
      • MIDI Command (Options)
    • Columns available for “network” cue type:
      • QLab 5
        • Network Patch Number
        • Network Patch Channel
        • Custom String
      • QLab 4
        • Message Type (Options)
        • OSC Cue Number (Only if using QLab Message Type)
        • Command
          • For QLab Messages (Options)
          • For an OSC message, you may now include a raw string in this column

    Examples
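
    As a rough illustration only (not one of the repository's bundled examples), a minimal cue sheet with the required columns plus two optional ones could be generated with Python's standard csv module; the column names follow the lists above.

    import csv

    # Hypothetical cue sheet: required columns (Number, Type, Name) plus the
    # optional Notes and Follow columns. "midi" and "network" cues would add
    # their own columns as listed above.
    rows = [
        {"Number": "12", "Type": "start", "Name": "Cue 12 GO", "Notes": "House to half", "Follow": "1"},
        {"Number": "13", "Type": "start", "Name": "Cue 13 GO", "Notes": "", "Follow": "0"},
    ]

    with open("cues.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["Number", "Type", "Name", "Notes", "Follow"])
        writer.writeheader()
        writer.writerows(rows)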

    To run in development:

    python3 -m pip install --upgrade pip
    python3 -m pip install -r requirements.txt
    
    • Run:
    python3 application.py
    
    • The application was bundled for distribution using pyinstaller. To re-bundle, install pyinstaller:
    python3 -m pip install pyinstaller
    
    • Then run:
    pyinstaller application.spec
    

    If you want to run some tests:

    • Install Pytest
    pip install pytest
    
    • Run Pytest
    pytest
    

    Recommendations for future features are very welcome!

    Visit original content creator repository
    https://github.com/fross123/csv_to_qlab

  • influxdb

    swarmstack/influxdb

    Docker compose file for InfluxDB OSS version, also useful for Prometheus long-term storage.

    See https://github.com/swarmstack/victoria-metrics for a highly performant time-series database that can be used for Prometheus metrics long-term storage; it uses significantly less RAM and supports higher-cardinality time-series data than the OSS version of InfluxDB.

    DEPLOY INFLUXDB AS A STACK

    INFLUXDB_ADMIN_USER='admin' \
    INFLUXDB_ADMIN_PASSWORD='admin' \
    INFLUXDB_USER='prometheus' \
    INFLUXDB_USER_PASSWORD='prompass' \
    docker stack deploy -c docker-compose.yml influxdb
    

    Or you can take some or all of the defaults above:

    docker stack deploy -c docker-compose.yml influxdb
    

    swarmstack users should use docker-compose-swarmstack.yml instead.

    PROMETHEUS REMOTE READ/WRITE DATABASE (Optional)

    Add remote-write and remote-read stanzas to your Prometheus configuration in order to use InfluxDB to store Prometheus metrics longer-term. swarmstack users can add the below to localswarmstack/prometheus/conf/prometheus.yml, otherwise just substitute http://influxdb with your Influx address:

    alerting:
      alertmanagers:
      - static_configs:
        - targets: [ 'alertmanager:9093', 'alertmanagerB:9093' ]
    
    remote_write:
      - url: "http://influxdb:8086/api/v1/prom/write?db=prometheus&u=prometheus&p=prompass"
    
    remote_read:
      - url: "http://influxdb:8086/api/v1/prom/read?db=prometheus&u=prometheus&p=prompass"
    

    GRAFANA DASHBOARD

    A Grafana dashboard which nicely visualizes all ‘internal’ InfluxDB OSS metrics documented at InfluxData.com and exported via influxdb_stats_exporter into Prometheus.

    Visit original content creator repository https://github.com/swarmstack/influxdb
  • llm-table-survey

    LLM-Table-Survey


    📄 Paper List

    Large Language Model

    • GPT-3, Language Models are Few-Shot Learners. NeurIPS 20. [Paper]
    • T5, Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. [Paper]
    • FLAN, Finetuned Language Models Are Zero-Shot Learners. ICLR 22. [Paper] [Code]
    • DPO, Direct Preference Optimization: Your Language Model is Secretly a Reward Model. NeurIPS 23. [Paper]
    • PEFT, The Power of Scale for Parameter-Efficient Prompt Tuning. EMNLP 21. [Paper]
    • LoRA, LoRA: Low-rank Adaptation of Large Language Models. ICLR 22. [Paper]
    • Chain-of-thought Prompting, Chain-of-thought prompting elicits reasoning in large language models. NeurIPS 22. [Paper]
    • Least-to-most Prompting, Least-to-most prompting enables complex reasoning in large language models. ICLR 23. [Paper]
    • Self-consistency Prompting, Self-consistency improves chain of thought reasoning in language models. ICLR 23. [Paper]
    • ReAct, ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 23. [Paper] [Code]

    Pre-LLM Era Table Training

    • TaBERT, TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data. ACL 20 Main. [Paper] [Code]
    • TaPEx, TAPEX: Table Pre-training via Learning a Neural SQL Executor. ICLR 22. [Paper] [Code] [Models]
    • TABBIE, TABBIE: Pretrained Representations of Tabular Data. NAACL 21 Main. [Paper] [Code]
    • TURL, TURL: Table Understanding through Representation Learning. VLDB 21. [Paper] [Code]
    • RESDSQL, RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL. AAAI 23. [Paper] [Code]
    • UnifiedSKG, UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models. EMNLP 22 Main. [Paper ] [Code]
    • SpreadsheetCoder, SpreadsheetCoder: Formula Prediction from Semi-structured Context. ICML 21. [Paper] [Code]

    Table Instruction-Tuning

    Code LLM

    Hybrid of Table & Code

    Parameter-Efficient Fine-Tuning

    Direct Preference Optimization

    • SENSE, Synthesizing Text-to-SQL Data from Weak and Strong LLMs. ACL 24. [Paper]

    Small Language Model + Large Language Model

    • ZeroNL2SQL, Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL. VLDB 24. [Paper]

    Multimodal Table Understanding & Extraction

    • LayoutLM, LayoutLM: Pre-training of Text and Layout for Document Image Understanding. KDD 20. [Paper]
    • PubTabNet, Image-Based Table Recognition: Data, Model, and Evaluation. ECCV 20. [Paper] [Code & Data]
    • Table-LLaVA, Multimodal Table Understanding. ACL 24. [Paper] [Code] [Model]
    • TableVLM, TableVLM: Multi-modal Pre-training for Table Structure Recognition. ACL 23. [Paper]
    • PixT3, PixT3: Pixel-based Table-To-Text Generation. ACL 24. [Paper]

    Representation

    • Tabular representation, noisy operators, and impacts on table structure understanding tasks in LLMs. NeurIPS 2023 second table representation learning workshop. [Paper]
    • SpreadsheetLLM, SpreadsheetLLM: Encoding Spreadsheets for Large Language Models. arXiv 24. [Paper]
    • Enhancing Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies. EMNLP 23. [Paper] [Code]
    • Tables as Texts or Images: Evaluating the Table Reasoning Ability of LLMs and MLLMs. arXiv 24. [Paper]

    Prompting

    NL2SQL

    • The Dawn of Natural Language to SQL: Are We Fully Ready? VLDB 24. [Paper] [Code]
    • MCS-SQL, MCS-SQL: Leveraging Multiple Prompts and Multiple-Choice Selection For Text-to-SQL Generation. [Paper]
    • DIN-SQL, DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction Prompting, Decompose. NeurIPS 23. [Paper] [Code]
    • DAIL-SQL, Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation. VLDB 24. [Paper] [Code]
    • C3, C3: Zero-shot Text-to-SQL with ChatGPT. arXiv 24. [Paper] [Code]

    Table QA

    • Dater, Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning. SIGIR 23. [Paper] [Code]
    • Binder, Binding language models in symbolic languages. ICLR 23. [Paper] [Code]
    • ReAcTable, ReAcTable: Enhancing ReAct for Table Question Answering. VLDB 24. [Paper] [Code]
    • E5, E5: Zero-shot Hierarchical Table Analysis using Augmented LLMs via Explain, Extract, Execute, Exhibit and Extrapolate. NAACL 24. [Paper] [Code]
    • Chain-of-Table, Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding. ICLR 24. [Paper]
    • ITR, An Inner Table Retriever for Robust Table Question Answering. ACL 23. [Paper]
    • LI-RAGE, LI-RAGE: Late Interaction Retrieval Augmented Generation with Explicit Signals for Open-Domain Table Question Answering. ACL 23. [Paper]

    Spreadsheet

    • SheetCopilot, SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models Agent. NeurIPS 23. [Paper] [Code]
    • SheetAgent, SheetAgent: A Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models. arXiv 24. [Paper]
    • Vision Language Models for Spreadsheet Understanding: Challenges and Opportunities. arXiv 24. [Paper]

    Multi-task Framework

    • StructGPT, StructGPT: A General Framework for Large Language Model to Reason over Structured Data. EMNLP 23 Main. [Paper] [Code]
    • TAP4LLM, TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning. arXiv 23. [Paper]
    • UniDM, UniDM: A Unified Framework for Data Manipulation with Large Language Models. MLSys 24. [Paper]
    • Data-Copilot, Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow. arXiv 23. [Paper] [Code]

    Tools

    • LlamaIndex
    • PandasAI
    • Vanna
    • DB-GPT. DB-GPT: Empowering Database Interactions with Private Large Language Models. [Paper] [Code]
    • RetClean. RetClean: Retrieval-Based Data Cleaning Using Foundation Models and Data Lakes. [Paper] [Code]

    Survey

    • A Survey of Large Language Models. [Paper]
    • A Survey on Large Language Model Based Autonomous Agents. [Paper]
    • Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks. [Paper]
    • Transformers for tabular data representation: A survey of models and applications. [Paper]
    • A Survey of Table Reasoning with Large Language Models. [Paper]
    • A survey on table question answering: Recent advances. [Paper]
    • Large Language Models (LLMs) on Tabular Data – A Survey. [Paper]
    • A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions. [Paper]

    📊 Datasets & Benchmarks

    Benchmarks

    Name Keywords Artifact Paper
    MBPP Code link arXiv 21
    HumanEval Code link arXiv 21
    Dr.Spider NL2SQL, Robustness link ICLR 23
    WikiTableQuestions Table QA link ACL 15
    WikiSQL Table QA, NL2SQL link arXiv 17
    TabFact Table Fact Verification link ICLR 20
    HybridQA Table QA link EMNLP 20
    FeTaQA Table QA link TACL 22
    RobuT Table QA link ACL 23
    AnaMeta Table Metadata link ACL 23
    GPT4Table Table QA, Table-to-text link WSDM 24
    ToTTo Table-to-text link EMNLP 20
    SpreadsheetBench Spreadsheet Manipulation link NeurIPS 24
    BIRD NL2SQL link NeurIPS 23
    Spider NL2SQL link EMNLP 18
    Dr.Spider NL2SQL link ICLR 23
    ScienceBenchmark NL2SQL link VLDB 24
    DS-1000 Data Analysis link ICML 23
    InfiAgent-DABench Data Analysis link ICML 24
    TableBank Table Detection link LREC 20
    PubTabNet Table Extraction link ECCV 20
    ComTQA Visual Table QA, Table Detection, Table Extraction link arXiv 24

    Datasets

    Name Keywords Artifact Paper
    TableInstruct Table Instruction Tuning link arXiv 23
    WDC Web Table link WWW 16
    GitTables GitHub CSVs link SIGMOD 23
    DART Table-to-text link NAACL 21
    MMTab Multimodal Table Understanding link ACL 24
    SchemaPile Database Schemas link SIGMOD 24


    Visit original content creator repository
    https://github.com/godaai/llm-table-survey

  • sent

    sent is a simple plaintext presentation tool.

    sent does not need latex, libreoffice or any other fancy file format, it uses
    plaintext files and png images. Every paragraph represents a slide in the
    presentation.

    The presentation is displayed in a simple X11 window. The content of each slide
    is automatically scaled to fit the window and centered so you also don’t have to
    worry about alignment. Instead you can really concentrate on the content.

    Demo

    To get a little demo, just type

    make && ./sent example
    

    You can navigate with the arrow keys and quit with q.

    Usage

    sent FILE1 [FILE2 ...]
    

    If one FILE equals -, stdin will be read. Produce image slides by prepending a
    @ in front of the filename as a single paragraph. Lines starting with # will
    be ignored. A \ at the beginning of the line escapes @ and #. A
    presentation file could look like this:

    sent
    
    @nyan.png
    
    depends on
    - Xlib
    - libpng
    
    sent FILENAME
    one slide per paragraph
    # This is a comment and will not be part of the presentation
    \# This and the next line start with backslashes
    
    \@FILE.png
    
    thanks / questions?
    

    Visit original content creator repository
    https://github.com/mrinjamul/sent

  • heimdall

    heimdall


    Heimdall is a self-hosted email alias/forwarding service. I built this as a privacy tool to fight spam and also better manage access to my personal email address. As a self-hosted and self-managed service, you have complete control over your data. With 3rd party email forwarding services, you are forced to trust a company with your emails.

    This has also been a really fun project for me to learn more about AWS and the Serverless framework.

    Check out: How I built Heimdall, an open-source personal email guardian.

    Changelog can be found under Releases.

    Motivations

    1. With Heimdall, you completely own and manage your data and the service. No feature limitations or having to trust a third-party company with your data.
    2. Heimdall is meant for individual users to deploy and use and contains user-friendly setup instructions.
    3. Heimdall is easy to run – it utilizes the idea of serverless computing, so there is zero server configuration or provisioning.
    4. Heimdall is easy to deploy – it uses the Serverless framework (not to be confused with small-letter serverless in Point 3 above) so you can deploy with a single command.

    Features

    Overview

    1. Receive safely: Receive emails on single-use aliases and forward them to your personal inbox.
    2. Reply anonymously: Reply to emails from your alias without revealing your personal email address.
    3. Attachments: Attachments are supported on incoming and outgoing emails (subject to size limits – see below).
    4. Email commands: Manage your aliases through email directly – no separate app or website required.
    5. Usage stats: Easily check the usage stats of each alias.

    Receiving emails

    Heimdall operates as a whitelisting (default-deny) service. All incoming emails to your domain are rejected by default unless they are to valid aliases. Emails received on valid aliases will be forwarded to your personal email address.

    Forwarded emails will preserve metadata information, such as any other recipients in the “to” or “CC” headers.

    Replying

    To reply, simply reply normally to the received email. Other recipients in the original email will not receive your reply.

    You may include other recipients in the “to” and “CC” list, either by manually inserting them, or using “reply-all”.

    Note: If you do that, you will disclose your email address to them. However, the original sender will still not be able to see your email address, provided you are replying to the original sender through the alias. The original sender will also not be able to see the other recipients.

    Attachments

    Attachments are supported, although size limits apply to the entire email message. This is a hard limitation imposed by AWS and cannot be circumvented. See Limitations below.

    Commands

    To interact with the service, send a single email to one of the following email addresses.

    Generate an alias

    Email generate@yourverifieddomain.com with the description as the subject. You will receive the generated alias as a reply.

    The description lets you identify an alias and its use. E.g. “Sign up for Service X”.

    Screenshot
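
    Normally you would just send this from your regular mail client; purely as an illustration, the same command could be issued programmatically with Python's standard smtplib. The SMTP host, port and credentials below are placeholders for your own mail provider.

    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "you@example.com"                 # your personal address
    msg["To"] = "generate@yourverifieddomain.com"   # Heimdall command address
    msg["Subject"] = "Sign up for Service X"        # becomes the alias description

    # Placeholder SMTP settings: substitute your own provider and credentials.
    with smtplib.SMTP_SSL("smtp.example.com", 465) as smtp:
        smtp.login("you@example.com", "app-password")
        smtp.send_message(msg)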

    List aliases

    Email list@yourverifieddomain.com. You will receive a list of all aliases as a reply.

    Dev note: This reads up to a maximum of 1MB of data (due to AWS’s limitations).

    Remove an alias

    Email remove@yourverifieddomain.com with the alias as the title (case-sensitive). You will receive the operation outcome (success/failure) as a reply.

    Usage stats

    Email info@yourverifieddomain.com with the alias as the title (case-sensitive). You will receive usage information for the particular alias.

    Supported usage stats:

    • Alias creation date
    • Emails received
    • Emails sent
    • Date of last received email
    • Date of last sent email

    Update an alias

    Coming soon – not supported yet.

    Known Limitations

    Received emails must be <30MB. Outgoing emails must be <10MB.

    Setup

    Pre-requisites: You need to own a domain and have an AWS account. For reasonable use cases, you should not exceed AWS’s free tier (which is very generous). You should also already have Yarn and NodeJS installed.

    Optional: To be able to reply to emails, you need to request AWS Support to un-sandbox your SES account.

    1. Add and verify your domain in AWS Simple Email Service (SES).
    2. In AWS’s SES console, generate a set of SMTP credentials. Take note of that, and also your connection information on SES’s “SMTP Settings” page.
    3. Populate required environment variables in .env.sample, and rename to .env. It is important that EMAIL matches your personal email exactly. Also note that you should avoid port 25, due to AWS’s default blocking of outbound traffic.
    4. Run yarn global add serverless. Then, check out Serverless’s guide to set up Serverless’s credentials for accessing your AWS account programmatically.
    5. Run yarn install.
    6. Set up Serverless, then run yarn run deploy-prod.
    7. Add a receipt rule in SES to trigger your S3 bucket (created in step 6). For “recipients”, enter your domain name (e.g. yourverifieddomain.com). Preferably, name your rule descriptively (e.g. prod).

    Development (optional)

    If you want to build new features or tweak existing features, you can set up a parallel development environment that runs alongside production (above).

    1. Ensure that the DEV_SUBDOMAIN environment variable is set in .env (e.g. test).
    2. Run yarn run deploy-dev. This creates a parallel development CloudFormation stack.
    3. Add a new receipt rule in SES before your production rule to trigger your development S3 bucket. For “recipients”, enter the same test subdomain as you set in step 1 (e.g. test.yourverifieddomain.com). Preferably, name your rule descriptively (e.g. dev).

    Note: You need to update your DNS records for test.yourverifieddomain.com as you did when verifying your domain for AWS SES.

    Migration

    To run migration scripts, first compile using tsc scripts/migrate_vX.ts, then run using node scripts/migrate_vX.js.

    Visit original content creator repository https://github.com/fterh/heimdall
  • whitespacy

    Whitespacy

    Whitespacy is a polyglot formatter, written in Python, for the C and Whitespace programming languages.

    It takes as input a valid C file and a valid Whitespace file, and produces, as output, a polyglot file that is valid in both C and Whitespace, while behaving exactly like the inputs when interpreted/compiled.

    Whitespacy also includes minic.py, a simple C-minifier.

    But why ?

    The goal of the project was to demonstrate that it is possible to embed a fully functional Whitespace program within the whitespace characters ( , \t and \n) of a program written in another language.

    Is it useless? For sure. Is it trivial? Hell no.

    Dependencies

    Whitespacy only uses the standard libraries of Python. However, if you wish to compile the C files, you will need a C compiler like gcc or clang.

    To interpret the Whitespace files, I have used an online Whitespace interpreter.

    Example

    Let’s take as inputs this (nice) “Hello, World!” C program

    #include <stdio.h>
    
    #define NICE 69420
    
    int isNice(int x) {
        return x == NICE;
    }
    
    /* tricky quote " */
    #define min(x, y) \
    ((x) < (y) ? (x) : (y))
    
    int main() {
        printf("Hello, World!\n");
    
        if (isNice(3 * 4 * 5 * min(13, 31) * 89))
            printf("nice.\n");
    
        /* tricky //
           string */
        if (0)
            printf("/* */ \" // \
            ");
    
        return 0; // no error
    }

    and a basic “Hello, World!” Whitespace program (see hello-world.ws).

    Then, running the command

    $ python whitespacy.py hello-world.c hello-world.ws -o polyglot.c

    produces the polyglot.c file

     #             include<stdio.h>
        
    #  define            NICE       69420
        
     int isNice(int x)      {            return x==NICE;}
        
    # define min(x , y)( (x)<(y )?      (x):(y)   )
        
                                
        
                      
        
    int main( )   {   printf("Hello,\x20World!\n" )  
        
      ;if(isNice(3*                 4*5*        
    min(13,31   
      ) *89)) printf("nice.\n")     ;if  (          0   )printf
        ("/*\x20*/\x20\"\x20//\x20\x20\x20\x20\x20\x20\x20\x20\x20")
     ; return 0                  
        ;
                          
        
                      
        
                    
        
      
    
    
    }

    which can be compiled with gcc (or clang)

    $ gcc polyglot.c -o polyglot
    $ ./polyglot
    Hello, World!
    nice.

    or interpreted in Whitespace:

    Hello, world!

    It should be noted that the output of whitespacy.py is different for each execution, as (part of) the formatting is randomly generated.

    Visit original content creator repository
    https://github.com/francois-rozet/whitespacy

  • myanmar-phone-number-validator-ts

    📞 Myanmar Phone Number Validator 🇲🇲

    Validate and decode Myanmar phone numbers with ease using this TypeScript library! It’s an evolution of the original JavaScript library by Kaung Myat Lwin, now enhanced to fully support TypeScript. 🚀

    Installation 📦

    To install this package, simply run:

    npm install myanmar-phone-number-validator

    Usage 🛠️

    This package offers a myanmarPhoneNumber object packed with helpful functions:

    • isValidMMPhoneNumber(phoneNumber: string): boolean: Verifies if a string is a valid Myanmar phone number, returning true for valid and false for invalid numbers.

    import { myanmarPhoneNumber } from 'myanmar-phone-number-validator';
    
    const phoneNumber = '0949880111';
    if (myanmarPhoneNumber.isValidMMPhoneNumber(phoneNumber)) {
        // It's a valid phone number!
    } else {
        // Oops, invalid phone number!
    }
    • getTelecomName(phoneNumber: string): string: Retrieves the name of the telecom operator associated with a phone number, or “Unknown” if it can’t be determined.

    import { myanmarPhoneNumber } from 'myanmar-phone-number-validator';
    
    const phoneNumber = '0949880111';
    const telecomName = myanmarPhoneNumber.getTelecomName(phoneNumber);
    • getPhoneNetworkType(phoneNumber: string): string: Determines the network type of a phone number, returning “Unknown” if it can’t be determined.

    import { myanmarPhoneNumber } from 'myanmar-phone-number-validator';
    
    const phoneNumber = '0949880111';
    const networkType = myanmarPhoneNumber.getPhoneNetworkType(phoneNumber);

    License 📜

    This project operates under the MIT License.

    Credit 🙌

    Huge thanks to Kaung Myat Lwin for creating the original JavaScript library that inspired this one! 👏

    Visit original content creator repository
    https://github.com/minmyatoo/myanmar-phone-number-validator-ts

  • sentiment-puccini

    Sentiment Analysis on Puccini letters

    Project 7

    Puccini by mail

    In collaboration with the Ricordi Archive, Dr. Patrizia Rebulla and Valeria Luti

    2024 will mark the centenary of the death of Giacomo Puccini, one of the greatest authors published by Casa Ricordi.
    The great interest in Puccini lies not only in his universal fame and the constant, still-current success of his works, but also in his language, rich in Tuscanisms and inventions. Puccini's letters can be studied from the perspective of sentiment analysis in order to link the texts to several aspects of his temperament: the periods of depression and discouragement from which he suffered, his insecurity about his own abilities, and certain difficulties in his relationships with his librettists.

    The Ricordi Archive keeps 381 letters written by Puccini to various recipients at Casa Ricordi, and 1387 letters sent to him by the publishing house. To these are added another 120-130 letters that are present in the database but not kept in the archive. In total, therefore, there are about 2000 letters to analyze.

    The project aims at studying the letters with aspect-based sentiment analysis techniques in order to extract not only the overall sentiment polarity but also specific aspects and opinions that may be associated with events in Giacomo Puccini's life.

    Dataset

    Provided by the Ricordi Archive link

    Project

    The aim of the project is to retrieve Giacomo Puccini's letters from the official website and to develop and apply pre-trained models for classifying the sentiment polarity of the letters over the years; a rough illustrative sketch follows the model list below.
    Dataset Links: Archivio Ricordi and Sentipolc-evalita16

    Different models applied:

    • Simple Neural Network models
    • SentITA models
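
    As a purely illustrative sketch of this kind of classification (not the project's actual models, which are simple neural networks and SentITA), an off-the-shelf multilingual sentiment model from Hugging Face could be applied to a letter as follows; this assumes the transformers package is installed, and the example sentence is invented.

    from transformers import pipeline

    # Off-the-shelf multilingual sentiment model (1-5 star polarity); a
    # stand-in for illustration, not one of the models used in the project.
    classifier = pipeline(
        "sentiment-analysis",
        model="nlptown/bert-base-multilingual-uncased-sentiment",
    )

    letter = "Caro Giulio, sono scoraggiato: il terzo atto non mi convince affatto."
    print(classifier(letter))  # e.g. [{'label': '2 stars', 'score': ...}]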

    Visit original content creator repository
    https://github.com/Andreaierardi/sentiment-puccini

  • optimize

    Dockerized script to bulk optimize images using libvips / sharp / bun.

    Example terminal output: Total: 23.74MB saved from 8 images

    Modes

    overwrite: Overwrite existing images (default). Scans the directory mounted to /images.

    docker run --rm -v ./images:/images -v ./backup:/backup henrygd/optimize
    

    restore: Restore original images from backup (reverses last overwrite operation).

    docker run --rm -v ./images:/images -v ./backup:/backup -e MODE=restore henrygd/optimize
    

    copy: Write images to different directory. This example converts all images to WEBP.

    docker run --rm -v ./images:/images -v ./optimized:/optimized -e MODE=copy -e FORMAT=webp henrygd/optimize
    

    Environment Variables

    Name Mode Description Default
    EXTENSIONS * Extensions to optimize [1] jpg,jpeg,png,webp,tif,tiff
    FIT * Fit method inside
    FORMAT copy Output format [2] unset
    JOBS * Number of parallel conversion jobs Based on available CPU cores [3]
    MAX_AGE * Age threshold in hours [4] unset
    MAX_HEIGHT * Max height of output image 4000
    MAX_WIDTH * Max width of output image 4000
    MIN_SIZE * Size threshold in kilobytes [5] unset
    MODE * Mode overwrite
    OWNER * Ownership of new files [6] root:root
    QUALITY * Output quality 80
    QUIET * Log only errors, not every file unset

    Fit Methods

    • inside: Preserving aspect ratio, resize the image to be as large as possible while ensuring its dimensions are less than or equal to both those specified.
    • cover: Crop to cover both provided dimensions.
    • contain: Embed within both provided dimensions.
    • fill: Ignore the aspect ratio of the input and stretch to both provided dimensions.
    • outside: Preserving aspect ratio, resize the image to be as small as possible while ensuring its dimensions are greater than or equal to both those specified.

    Footnotes

    1. Uppercase versions of extensions are added automatically.

    2. This will force all optimized images to be converted to the specified format. Possible values: webp, avif.

    3. Default JOBS value is one fewer than half of your available cores. If you have 16 cores, it’s 7 jobs. If you have 4 cores or fewer, it’s only one job.

    4. Images are only optimized if their file content was modified in the last MAX_AGE hours. For example, 24 would only optimize images updated in the last 24 hours.

    5. Images are only optimized if they are larger than MIN_SIZE. For example, 800 would only optimize images larger than 800kB.

    6. This applies only to newly created files. Overwritten files should maintain existing permissions. Value should use IDs. For example: -e OWNER=1000:1000, or -e OWNER="$(id -u):$(id -g)".
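
    To make the JOBS default in footnote 3 concrete, here is a small Python sketch of my reading of that rule: one fewer than half of the available cores, clamped to a minimum of one job.

    import os

    def default_jobs(cores=None):
        # One fewer than half of the available cores, but never fewer than one.
        cores = cores or os.cpu_count() or 1
        return max(1, cores // 2 - 1)

    print(default_jobs(16))  # 7
    print(default_jobs(4))   # 1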

    Visit original content creator repository https://github.com/henrygd/optimize